TransWikia.com

Writing to /dev/sdX instead of /dev/sdX1

Super User Asked on December 18, 2021

What happens physically if I want to copy some files to an external hard drive and do cp -a [file] /dev/sdX instead of cp -a [file] /dev/sdX1? Or when wiping a drive by overwriting it with zeros, is there a difference between dd if=/dev/zero of=/dev/sdX and dd if=/dev/zero of=/dev/sdX1? AFAIK, sdX means "the disk" and sdX1 the 1st partition on it. From a computer science perspective, both sdX and sdX1 seem to represent files (or objects in OOP lingo), but I don’t really understand what the difference is from a physical perspective.

PS.: The question is maybe analogous to the question how a drive formatted in FAT is different to one formatted in Ext4. My guess here would be that 99,99% is identical, and that it’s just a tiny file (the partition table?) which is different (formatting a drive usually takes only a few seconds). So the OS (?) will read the formatting information and then proceed to write the data according to a protocol specific to that format.

2 Answers

What happens physically if I want to copy some files to an external hard drive and do cp -a [file] /dev/sdX instead of cp -a [file] /dev/sdX1?

Assuming that you have proper permissions, neither command would write the file in a manner that is readily retrievable.
Both commands would cause the contents of file to be directly written to LBAs (aka sectors) and bypass any filesystem conventions.
No filename, no file size, no owner, no create time, no permissions, no attributes of any kind would be associated with the data that was written.

The first command, with destination /dev/sdX, would sequentially write the file contents starting at the first LBA of the physical drive. This would clobber the MBR (Master Boot Record) and the partition table, and make the contents of all filesystems inaccessible.
The second command, with destination /dev/sdX1, would sequentially write the file contents starting at the first LBA of partition number 1. This would clobber the boot sector of the partition, and likely corrupt the filesystem of this partition, making the filesystem (and all of its files) inaccessible.

In other words the primary difference between these two commands is the LBA (aka sector number) of where the writing starts.
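A minimal sketch of that difference, using an in-memory buffer as a stand-in for the drive. The 512-byte sector size, the partition start at LBA 64, and the names `raw_write`, `make_disk` and `PART1_START` are assumptions made for illustration only; nothing here touches a real device.

```python
SECTOR_SIZE = 512   # assumed logical sector size
PART1_START = 64    # hypothetical first LBA of partition 1


def raw_write(disk: bytearray, start_lba: int, payload: bytes) -> None:
    """Overwrite bytes sequentially from start_lba, ignoring any on-disk structure."""
    offset = start_lba * SECTOR_SIZE
    if offset + len(payload) > len(disk):
        raise ValueError("payload does not fit on the device")
    disk[offset:offset + len(payload)] = payload


def make_disk() -> bytearray:
    """Build a 1 MiB toy drive with a marker sector at LBA 0 and at PART1_START."""
    disk = bytearray(1024 * 1024)
    disk[0:SECTOR_SIZE] = b"T" * SECTOR_SIZE        # stand-in for MBR + partition table
    p = PART1_START * SECTOR_SIZE
    disk[p:p + SECTOR_SIZE] = b"F" * SECTOR_SIZE    # stand-in for the filesystem boot sector
    return disk


payload = b"file contents" * 100    # 1300 bytes, so it spans three sectors

whole = make_disk()
raw_write(whole, 0, payload)              # comparable to writing to /dev/sdX

part = make_disk()
raw_write(part, PART1_START, payload)     # comparable to writing to /dev/sdX1
```

In the first case the marker sector at LBA 0 is gone and the partition's first sector is untouched; in the second case it is the other way round. The same bytes are written either way, only the starting offset differs.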


Or when wiping a drive by overwriting it with zeros, is there a difference between dd if=/dev/zero of=/dev/sdX and dd if=/dev/zero of=/dev/sdX1?

The first command, with destination /dev/sdX, would completely zero-out all LBAs of the drive. No information would be readable any more (without some allegedly extraordinary techniques).
The second command, with destination /dev/sdX1, would zero-out all the LBAs of partition number 1. Although the definition of the partition still exists, the filesystem of that partition (and its files) has been overwritten and is no longer accessible. The LBAs outside this partition are not affected.
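The two wipes can be imitated on an ordinary image file. This is only a sketch: the temporary file, the 256-sector size, the partition range (LBAs 64 to 191) and the helper `zero_range` are all hypothetical, and the sector count is passed explicitly here, whereas with a real device node the kernel stops the write at the end of the drive or partition.

```python
import os
import tempfile

SECTOR_SIZE = 512           # assumed logical sector size
CHUNK = 64 * SECTOR_SIZE    # write in blocks, comparable to dd's bs= option


def zero_range(path: str, start_lba: int, sector_count: int) -> int:
    """Overwrite sector_count sectors with zeros, starting at start_lba.

    Returns the number of bytes written.
    """
    remaining = sector_count * SECTOR_SIZE
    written = 0
    with open(path, "r+b") as dev:
        dev.seek(start_lba * SECTOR_SIZE)
        while remaining > 0:
            n = min(CHUNK, remaining)
            dev.write(bytes(n))     # bytes(n) is n zero bytes
            remaining -= n
            written += n
    return written


# Hypothetical 256-sector image filled with 0xFF; partition 1 covers LBAs 64..191.
TOTAL, P_START, P_COUNT = 256, 64, 128
fd, image_path = tempfile.mkstemp()
os.close(fd)
with open(image_path, "wb") as f:
    f.write(b"\xff" * (TOTAL * SECTOR_SIZE))

zero_range(image_path, P_START, P_COUNT)    # comparable to of=/dev/sdX1
with open(image_path, "rb") as f:
    after_partition_wipe = f.read()

zero_range(image_path, 0, TOTAL)            # comparable to of=/dev/sdX
with open(image_path, "rb") as f:
    after_drive_wipe = f.read()

os.remove(image_path)
```

After the partition-sized wipe, the sectors before LBA 64 and after LBA 191 still hold their old contents; after the drive-sized wipe, nothing does.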


AFAIK, sdX means "the disk" ...

It refers to a drive.
SSDs and flash drives do not have "disks".

From a computer science perspective, both sdX and sdX1 seem to represent files

Not all operating systems treat devices as files (e.g. MS Windows).

...but I don't really understand what the difference is from a physical perspective.

These device names (actually device nodes) represent the top-level logical layout of the storage device. The convention established by the IBM PC and MS-DOS is that a PC mass-storage device is divided into logical partitions. (Flexible disks, aka floppies, are exempt from this convention.)

Each partition is defined by a starting LBA (aka sector number) and an ending LBA, and therefore has a size (number of LBAs or sectors). Each partition is also assigned an identification code for the filesystem type when the partition is formatted (for that filesystem).
Depending on the partitioning scheme (e.g. MS-DOS/MBR or GPT), there are optional flags and (volume) names.
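For the classic MS-DOS (MBR) scheme, the partition definitions can be read straight out of sector 0. The sketch below assumes the standard MBR layout (four 16-byte entries at byte offset 446, boot signature 0x55 0xAA at offset 510); the function name `parse_mbr` and the sample partition values are made up for illustration. Note that an MBR entry stores the starting LBA and the sector count, so the ending LBA is derived rather than stored.

```python
import struct

SECTOR_SIZE = 512
TABLE_OFFSET = 446    # the partition table follows the boot code area
ENTRY_SIZE = 16
# status, CHS of first sector, type code, CHS of last sector, start LBA, sector count
ENTRY = struct.Struct("<B3sB3sII")


def parse_mbr(sector0: bytes) -> list:
    """Return the used entries of an MBR partition table as dictionaries."""
    if len(sector0) < SECTOR_SIZE or sector0[510:512] != b"\x55\xaa":
        raise ValueError("no MBR boot signature")
    partitions = []
    for i in range(4):
        offset = TABLE_OFFSET + i * ENTRY_SIZE
        _status, _chs_first, ptype, _chs_last, start, count = ENTRY.unpack_from(sector0, offset)
        if ptype != 0:    # type code 0x00 marks an unused slot
            partitions.append({
                "number": i + 1,
                "type": ptype,
                "start_lba": start,
                "sectors": count,
                "end_lba": start + count - 1,
            })
    return partitions


# Build a hypothetical sector 0 holding one Linux (type 0x83) partition.
sector0 = bytearray(SECTOR_SIZE)
ENTRY.pack_into(sector0, TABLE_OFFSET, 0x00, b"\x00" * 3, 0x83, b"\x00" * 3, 2048, 204800)
sector0[510:512] = b"\x55\xaa"

table = parse_mbr(bytes(sector0))
```

The whole table is 64 bytes of plain parameters: a type code, a start and a length per partition. GPT uses a different on-disk layout, which this sketch does not handle.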

The entire drive is represented by a /dev/sdX device node. Other than the drive size (i.e. number of LBAs) and a partition table (if any), this device does not have any other salient properties that are user concerns.


PS.: The question is maybe analogous to the question how a drive formatted in FAT is different to one formatted in Ext4.

No, those are not analogous.
Filesystems exist at an abstraction layer above drive partitioning.
The complexity of a filesystem is several orders of magnitude greater than a partitioning scheme. You have also selected two filesystems that are extreme opposites in complexity and features.

My guess here would be that 99,99% is identical, and that it's just a tiny file (the partition table?) which is different (formatting a drive usually takes only a few seconds).

Your guess is incorrect. The duration that the computer expends to perform a procedure is not a reliable gauge of the complexity of that operation.

The ext4 filesystem has file ownership and permissions. FAT does not.

The ext4 filesystem has journaling for robustness from power cuts. FAT does not.

There are more differences.


Addendum

No, those are not analogous. Filesystems exist at an abstraction layer above drive partitioning.

But aren't both the partition tables and the file system specification saved on a very small portion of the drive?

Insisting that comparing different filesystems is analogous to drive versus partition is inane. A partition is contained within the drive; the partition cannot exist without a drive.

Whereas filesystems are mutually exclusive, especially FAT versus ext4 (although the relationship between ext2/ext3/ext4 filesystems is sort of the exception to the rule). Only one filesystem can exist in a partition. You can install one of these filesystems, and then forget about and never use the other.

Your attempt to divert focus to static information stored on the medium (i.e. the partition table and some vague filesystem data that you assume exists somewhere) makes no sense.
The definition of where a partition starts and ends (or what filesystem is to be installed) is formulated first.
Then that information is stored on the medium somewhere and somehow.
The real work begins when the OS has to perform I/O to the partition(s) as files are read and written.

In other words, the partition table (which doesn't change while partitions are mounted) and the filesystem structures (which will change/grow from their initialized form installed during the format) are not the salient components of partitions and filesystems (respectively).
Focusing on such data is akin to having a strategy of running a marathon that relies solely on your starting position (i.e. at the front of the crowd), and neglects the next 42 kilometers.

(FYI a long time ago for a job at a UNIX company, I wrote a FAT format utility. So at one point in time I knew exactly what disk formatting entailed.)


Addendum 2

Rather, they're like a rule set for the OS ...

No.
The partition table merely contains (parametric) data that is used by the OS.
The "rules" are implemented by the code that comprises the algorithms of the OS.
These algorithms use data.
Data are not algorithms (or "rules").

The filesystem ID for a partition installs a filesystem handler, which will process all subsequent open, read, write, lseek, and close system calls for that mountpoint.
The initialized structures written during format (e.g. empty allocation tables, empty root directory) are updated as the filesystem creates, writes, and deletes files.
The "rules" for performing open, read, write, lseek, and close file operations are implemented in the code of the filesystem handler.
All filesystem I/O operations will pass through the partition layer (to perform the LBA translation).

The pseudocode for an LBA translator (i.e. one that converts a given LBA (lba) of a filesystem "device" (device_mounted) to the LBA of the actual/physical device) could be:

if (device_mounted == drive)
    then
        if (lba >= drive_size)
            then  reject translation (return an error)
            else  actual_lba = lba

    else if (device_mounted == valid_partition_number)
        then
            if (lba >= partition_size)
                then  reject translation (return an error)
                else  actual_lba = lba + partition_start

    else  reject translation (return an error)

return (actual_lba)
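The same mapping as a runnable sketch. The `Drive` and `Partition` types, the use of `None` to mean "the whole drive", and the sample sizes are assumptions for illustration; a real kernel does this in its block layer, not like this.

```python
from typing import NamedTuple, Optional


class Partition(NamedTuple):
    start: int    # first LBA of the partition on the drive
    size: int     # number of sectors in the partition


class Drive(NamedTuple):
    size: int           # number of sectors on the drive
    partitions: dict    # partition number -> Partition


def translate_lba(drive: Drive, partition_number: Optional[int], lba: int) -> int:
    """Map an LBA relative to the mounted device onto an LBA of the physical drive.

    partition_number=None means the whole drive (e.g. /dev/sdX) is the device.
    """
    if lba < 0:
        raise ValueError("negative LBA")
    if partition_number is None:
        base, limit = 0, drive.size
    else:
        part = drive.partitions.get(partition_number)
        if part is None:
            raise LookupError(f"no partition {partition_number}")
        base, limit = part.start, part.size
    if lba >= limit:
        raise ValueError("LBA beyond end of device")
    return base + lba


# Hypothetical drive of 1,000,000 sectors with one partition starting at LBA 2048.
drive = Drive(size=1_000_000, partitions={1: Partition(start=2048, size=204_800)})

on_drive = translate_lba(drive, None, 10)    # the whole drive maps 1:1
on_part = translate_lba(drive, 1, 10)        # partition LBAs are offset by its start
```

Raising an exception takes the place of "reject translation", so an invalid request never yields an LBA at all.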

Handling partitions is just a simple mapping.
Filesystems are subsystems that have a user interface, require resources, and perform I/O.
Using some minute commonality is not a sensible basis for comparison.

Answered by sawdust on December 18, 2021

A disk like /dev/sdX can be seen as a sequence of bytes whose length matches the disk size. You can place some kind of partition table on /dev/sdX, but you don't have to: you can access /dev/sdX as if it were one large ordinary file, or you can place a file system directly on /dev/sdX.

With a partition table, the beginning of /dev/sdX is used to describe how different parts of /dev/sdX have been allocated to the partitions /dev/sdX1, /dev/sdX2, and so on. Because the partition table itself needs some space, each partition will be smaller than the entire disk /dev/sdX.

Again, the partitions may contain some file system or be accessed like an ordinary file.

Example:

|---/dev/sdX-------------------------------------------------------------|
| partition table |--/dev/sdX1-------|--/dev/sdX2------------------------|

So if you wipe /dev/sdX you will wipe all partitions and also the partition table. If you wipe /dev/sdX1 you will only wipe that partition.

A file system is a way to use a continuous device such as a disk or partition so that parts of its space can be dynamically allocated to different files in a directory structure. There are many ways to implement a file system; most maintain some kind of linked structures (lists, tables, or trees) that record which blocks belong to which file.
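One concrete form of such a linked list is the allocation table used by FAT, where each table entry names the next cluster of the same file. The toy below only illustrates that idea under simplifying assumptions: the class `ToyVolume`, the 4-byte cluster size and the `FREE`/`END` markers are invented here and do not match the real FAT on-disk format.

```python
CLUSTER_SIZE = 4    # unrealistically small, to keep the example readable
FREE = None         # toy marker for an unallocated cluster
END = -1            # toy marker for the last cluster of a file


class ToyVolume:
    def __init__(self, clusters: int) -> None:
        self.table = [FREE] * clusters    # allocation table: next cluster in each chain
        self.data = [b""] * clusters      # cluster contents
        self.directory = {}               # name -> (first cluster, file size)

    def write_file(self, name: str, content: bytes) -> None:
        chunks = [content[i:i + CLUSTER_SIZE]
                  for i in range(0, len(content), CLUSTER_SIZE)] or [b""]
        free = [i for i, entry in enumerate(self.table) if entry is FREE]
        if len(free) < len(chunks):
            raise OSError("volume full")
        used = free[:len(chunks)]
        for idx, (cluster, chunk) in enumerate(zip(used, chunks)):
            self.data[cluster] = chunk
            # Link to the next cluster of this file, or mark the end of the chain.
            self.table[cluster] = used[idx + 1] if idx + 1 < len(used) else END
        self.directory[name] = (used[0], len(content))

    def read_file(self, name: str) -> bytes:
        cluster, size = self.directory[name]
        out = b""
        while cluster != END:             # follow the chain through the table
            out += self.data[cluster]
            cluster = self.table[cluster]
        return out[:size]

    def delete_file(self, name: str) -> None:
        cluster, _size = self.directory.pop(name)
        while cluster != END:             # release every cluster in the chain
            next_cluster = self.table[cluster]
            self.table[cluster] = FREE
            cluster = next_cluster


vol = ToyVolume(8)
vol.write_file("a", b"AAAAAAAA")          # takes clusters 0 and 1
vol.write_file("b", b"BBBB")              # takes cluster 2
vol.delete_file("a")                      # frees clusters 0 and 1
vol.write_file("c", b"CCCCCCCCCCCC")      # needs three clusters: 0, 1 and 3
```

File "c" ends up fragmented around "b", yet reading it back works because the table records the chain 0, 1, 3. On a raw device without a file system there is no such bookkeeping, only a run of sectors.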

Answered by Henrik Carlqvist on December 18, 2021
