The DSK file format

The de-facto standard for Oric disk images is the DSK format. Unfortunately this extension covers two different file formats:

  • The older format, which stores only the high level structure of a disk — tracks and sectors. This is adequate for storing standard disks generated by the normal Oric disk operating systems, but cannot be used to archive disks with special formatting or protections. These disks are identified by the ORICDISK signature at the start of the file.
  • The newer format, which uses a byte-aligned representation of the data bits of a track so that a floppy disk controller can identify the various locations on a floppy while it is spinning: headers, synchronisation sequences, checksums, gaps, etc. This permits a greater range of formats to be represented, including simpler protection mechanisms. These disks are identified by the MFM_DISK signature at the start of the file.

It is possible to convert an old style DSK image to the new format with the old2mfm program.

The old (ORICDISK) format

All data is stored in little endian format.

Old DSK files start with a 256-byte header:

  • the 8-byte signature: ORICDISK;
  • the number of sides (32 bits);
  • the number of tracks (32 bits);
  • the number of sectors (32 bits);
  • unused padding data to fill the remainder of the 256 bytes.

For example, a Sedoric 3.0-formatted disk's header would look like:

  • 4F5249434449534B → ORICDISK
  • 02000000 → 2 sides
  • 50000000 → 80 tracks
  • 11000000 → 17 sectors
  • 236 null bytes
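The header layout above can be decoded with a few lines of Python. This is a minimal sketch, not a reference implementation: the function name `parse_oricdisk_header` and the dictionary keys are illustrative choices, and the example header is built from the hexadecimal values listed above.

```python
import struct

HEADER_SIZE = 256              # the header always occupies 256 bytes
OLD_SIGNATURE = b"ORICDISK"    # identifies the old format


def parse_oricdisk_header(header: bytes) -> dict:
    """Decode the fixed fields of an old-format header (hypothetical helper)."""
    if len(header) < HEADER_SIZE:
        raise ValueError("header must be at least 256 bytes long")
    # 8-byte signature followed by three little-endian 32-bit integers
    signature, sides, tracks, sectors = struct.unpack_from("<8sIII", header, 0)
    if signature != OLD_SIGNATURE:
        raise ValueError("not an ORICDISK image")
    return {"sides": sides, "tracks": tracks, "sectors": sectors}


# The Sedoric example from the text, padded with null bytes to 256 bytes.
example = bytes.fromhex(
    "4F5249434449534B" "02000000" "50000000" "11000000"
).ljust(HEADER_SIZE, b"\x00")
print(parse_oricdisk_header(example))
```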

Disk data

After the header comes the disk data, with the following layout:

  • all data of the first side is stored first, then all data for the second side (if the floppy is double sided);
  • all sectors come in the natural order, track by track, sector by sector, in increasing order;
  • sectors are assumed to be 256 bytes long.
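Given that layout, the file offset of any sector follows from simple arithmetic. The sketch below assumes a zero-based `sector_index` (the position of the sector within its track, not the sector number recorded on the disk) and takes the track and sector counts from the header. The function name is hypothetical.

```python
HEADER_SIZE = 256   # size of the ORICDISK header
SECTOR_SIZE = 256   # sectors are assumed to be 256 bytes long


def oricdisk_sector_offset(side, track, sector_index, tracks, sectors):
    """Return the file offset of a sector in an old-format image.

    side, track and sector_index are zero-based; tracks and sectors are
    the per-side track count and per-track sector count from the header.
    """
    if not (0 <= track < tracks and 0 <= sector_index < sectors):
        raise ValueError("track or sector out of range")
    # Whole sides come first, then tracks within a side, then sectors.
    linear = (side * tracks + track) * sectors + sector_index
    return HEADER_SIZE + linear * SECTOR_SIZE


# First sector of side 1 on an 80-track, 17-sector disk.
print(oricdisk_sector_offset(1, 0, 0, tracks=80, sectors=17))
```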

The new (MFM_DISK) format

All the data is stored in little endian format. In the documentation below, data is indicated in hexadecimal and counts in decimal.

Header

New DSK files start with a 256-byte header:

  • the 8-byte signature: MFM_DISK;
  • the number of sides (32 bits);
  • the number of tracks (32 bits);
  • the geometry type (32 bits);
  • unused padding data to fill the remainder of the 256 bytes.

The padding area should contain null bytes, and is reserved for future extensions.

Geometry

This field has either the value `1` or the value `2`, indicating the track ordering used by the file.

  1. side 0, track 0; side 0, track 1; side 0, track 2 … side 0, track n; side 1, track 0; side 1, track 1; side 1, track 2 … side 1, track n; … ;
  2. side 0, track 0; side 1, track 0; … side n, track 0; side 0, track 1, side 1, track 1 … .

Oric disk images are most likely to use the first type.

For example, a Sedoric 3.0-formatted disk's header would look like:

  • 4D464D5F4449534B → MFM_DISK
  • 02000000 → 2 sides
  • 50000000 → 80 tracks
  • 01000000 → Geometry type 1
  • 236 null bytes
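A reader for the new header differs from the old one only in the signature and in the meaning of the third integer. The following sketch uses a hypothetical helper name and rejects geometry values other than the two described above. That strictness is a choice made for this example, not a requirement stated by the format.

```python
import struct

HEADER_SIZE = 256
NEW_SIGNATURE = b"MFM_DISK"


def parse_mfm_disk_header(header: bytes) -> dict:
    """Decode the fixed fields of a new-format header (hypothetical helper)."""
    if len(header) < HEADER_SIZE:
        raise ValueError("header must be at least 256 bytes long")
    signature, sides, tracks, geometry = struct.unpack_from("<8sIII", header, 0)
    if signature != NEW_SIGNATURE:
        raise ValueError("not an MFM_DISK image")
    if geometry not in (1, 2):
        raise ValueError("unknown geometry type: %d" % geometry)
    return {"sides": sides, "tracks": tracks, "geometry": geometry}


# Signature, 2 sides, 80 tracks, geometry type 1, then null padding.
example = bytes.fromhex(
    "4D464D5F4449534B" "02000000" "50000000" "01000000"
).ljust(HEADER_SIZE, b"\x00")
print(parse_mfm_disk_header(example))
```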

Disk data

After the header comes the list of tracks, ordered according to the header's geometry type.

Each track is 6400 bytes long, of which the first 6250 bytes are useful data. The remaining 150 ensure alignment to a multiple of 256 bytes.
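Since every track occupies a fixed 6400-byte slot, locating a track only requires turning (side, track) into a slot index according to the geometry type. This is a small sketch with illustrative names. Side and track numbers are zero-based.

```python
HEADER_SIZE = 256    # MFM_DISK header
TRACK_SLOT = 6400    # bytes reserved per track in the file
TRACK_DATA = 6250    # useful bytes at the start of each slot


def mfm_track_offset(side, track, sides, tracks, geometry):
    """Return the file offset of a track's 6400-byte slot."""
    if not (0 <= side < sides and 0 <= track < tracks):
        raise ValueError("side or track out of range")
    if geometry == 1:
        index = side * tracks + track    # all of side 0, then all of side 1
    elif geometry == 2:
        index = track * sides + side     # sides interleaved track by track
    else:
        raise ValueError("unknown geometry type")
    return HEADER_SIZE + index * TRACK_SLOT


def read_track(image: bytes, side, track, sides, tracks, geometry):
    """Return the 6250 useful bytes of one track from an in-memory image."""
    start = mfm_track_offset(side, track, sides, tracks, geometry)
    return image[start:start + TRACK_DATA]
```

For a two-sided, 80-track image, side 1 track 0 starts at offset 256 + 80 × 6400 under geometry 1 but at 256 + 6400 under geometry 2.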

Inside of a sector, each byte on the track represents a properly-formed MFM encoding of that byte on disk. Outside of a sector bytes have special meanings:

  • A1 represents an A1 MFM sync mark: the normal MFM encoding of A1 but with a missing clock bit between bits 4 and 5;
  • C2 represents a C2 MFM sync mark: the normal MFM encoding of C2 but with a missing clock bit between bits 3 and 4.

MFM_DISK therefore represents disks on which:

  • all sync marks are byte-aligned and outside of sector data;
  • data is stored at a uniform rate;
  • all flux transitions have perfect amplitude and sit exactly in the centre of their windows.

Reconstructing the MFM bit stream from an MFM_DISK therefore involves a partial simulation of a disk controller to keep track of whether each byte is inside or outside of a sector. Emulation of an MFM_DISK requires reconstruction of the bit stream and emulation of a disk controller.
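To illustrate what reconstructing the bit stream involves, the sketch below encodes one byte into its 16 MFM bit cells (clock bit, then data bit, most significant data bit first) and optionally suppresses one clock bit to form a sync mark. It assumes the bit numbering used in this document counts from 0 at the most significant bit, which is the reading that yields the widely used sync patterns 4489 and 5224. The function name and the `drop_clock_before` parameter are illustrative.

```python
def mfm_encode(byte, prev_data_bit=0, drop_clock_before=None):
    """Encode one byte as 16 MFM bit cells, returned as an integer.

    A clock bit is set only between two zero data bits. If
    drop_clock_before is given, the clock bit preceding that data bit
    (0 = most significant) is forced to zero to form a sync mark.
    """
    cells = 0
    prev = prev_data_bit
    for i in range(8):
        data = (byte >> (7 - i)) & 1
        clock = 1 if (prev == 0 and data == 0) else 0
        if i == drop_clock_before:
            clock = 0  # the deliberately missing clock transition
        cells = (cells << 2) | (clock << 1) | data
        prev = data
    return cells


print(hex(mfm_encode(0xA1)))                       # ordinary A1
print(hex(mfm_encode(0xA1, drop_clock_before=5)))  # A1 sync mark
print(hex(mfm_encode(0xC2, drop_clock_before=4)))  # C2 sync mark
```

A full reconstruction would also have to decide, byte by byte, whether an A1 or C2 lies inside sector data (encode normally) or outside it (encode as a sync mark). That is the controller-like bookkeeping described above and is not attempted here.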

IBM System 34 MFM Format

Track data

A track starts with the pre-index gap:

  • 4E or FF (*80)
  • 00 (*12)
  • C2 C2 C2

Each value of C2 is missing the normal clock transition between bits 3 and 4 and acts as a synchronisation mark.

The pre-index gap is followed by the index mark:

  • FC

And the post-index gap:

  • 4E or FF (*50)

The Oric disk interfaces detect track starts via the index hole, a physical property of the disk, not by sensing the track header; the track header is therefore optional for all Oric disk operating systems. Sedoric abbreviates the pre-index gap to start with 40 instances of 4E rather than 80 and the post-index gap to contain 40 instances of 4E rather than 50.

Sectors

Sectors are stored as an ID record, identifying the sector that is about to appear, followed by a gap and a data record.

Both ID and data records begin with the sequence:

  • 00 (*12)
  • A1 (*3)

Each value of A1 is missing the normal clock transition between bits 4 and 5 and acts as a synchronisation mark. CRC calculation starts with the first A1; with a standard disk encoding the CRC generator is therefore primed with three copies of A1 before encountering any further data.
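The priming of the CRC generator can be checked numerically. The text above does not name the polynomial; the sketch below assumes the CRC used by WD177X-family controllers, x^16 + x^12 + x^5 + 1 (0x1021) with the register preset to all ones and the result stored high byte first. Under that assumption, feeding the three A1 bytes leaves the register at CDB4.

```python
def crc16_ccitt(data: bytes, crc: int = 0xFFFF) -> int:
    """Bitwise CRC-16 with polynomial 0x1021, MSB first, no final XOR."""
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


# State of the generator after the three sync marks.
primed = crc16_ccitt(b"\xA1\xA1\xA1")
print(hex(primed))

# CRC of a complete record: the sync marks, the mark byte and the payload.
record = b"\xA1\xA1\xA1" + bytes([0xFE, 0, 0, 1, 1])
print(crc16_ccitt(record).to_bytes(2, "big").hex())
```

A convenient property of this CRC is that running it over a record followed by its own two CRC bytes gives zero, which is how a reader can verify a record without separating payload and checksum.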

ID Records

An ID record has the form:

  • FE
  • Track number
  • Side
  • Sector number
  • Number of bytes per sector (1=256 bytes, 2=512 bytes, 3 = 1024 bytes, 4 = 2048 bytes)
  • 2 bytes CRC

This is then followed by the ID gap, which is:

  • FF or 4E (*22)

Data Records

A data record has the form:

  • FB
  • 256 or 512 bytes with the actual data
  • 2 bytes CRC

FB indicates ordinary data. IBM also permits F8 as an alternative, meaning deleted data. The WD177X used by both the Microdisc and the Jasmin will read track data with either mark and indicate whether the sector was flagged as deleted via its status register.

This is followed by the data gap, which is:

  • FF or 4E (*54)
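Putting the ID and data record layouts together, a simple reader can walk the bytes of one MFM_DISK track and collect its sectors. This is a simplified sketch rather than a controller emulation. It assumes well-formed tracks, searches for the byte sequence A1 A1 A1 followed by a mark, skips over sector data once it has been read, and uses the CRC parameters of WD177X-family controllers (polynomial 0x1021, preset FFFF, high byte first). The size is taken as 128 << code, which matches the table above. All names are illustrative.

```python
SYNC = b"\xA1\xA1\xA1"


def crc16_ccitt(data: bytes, crc: int = 0xFFFF) -> int:
    """CRC-16, polynomial 0x1021, MSB first (assumed controller CRC)."""
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021 if crc & 0x8000 else crc << 1) & 0xFFFF
    return crc


def scan_track(track: bytes) -> list:
    """Return a list of dicts, one per sector found on the track."""
    found = []
    pos = 0
    while True:
        pos = track.find(SYNC + b"\xFE", pos)          # next ID record
        if pos < 0 or pos + 10 > len(track):
            break
        track_no, side, sector_no, size_code = track[pos + 4:pos + 8]
        id_crc = int.from_bytes(track[pos + 8:pos + 10], "big")
        id_ok = crc16_ccitt(track[pos:pos + 8]) == id_crc
        size = 128 << size_code                        # 1 = 256, 2 = 512, ...
        pos += 10
        # The data record starts with FB (normal) or F8 (deleted).
        marks = [p for p in (track.find(SYNC + b"\xFB", pos),
                             track.find(SYNC + b"\xF8", pos)) if p >= 0]
        if not marks:
            break
        start = min(marks)
        end = start + 4 + size
        if end + 2 > len(track):
            break
        data_crc = int.from_bytes(track[end:end + 2], "big")
        found.append({
            "track": track_no, "side": side, "sector": sector_no,
            "deleted": track[start + 3] == 0xF8,
            "data": track[start + 4:end],
            "id_ok": id_ok,
            "data_ok": crc16_ccitt(track[start:end]) == data_crc,
        })
        pos = end + 2                                  # continue after the CRC
    return found
```

Unlike a real controller, this sketch does not give up if the data mark fails to appear shortly after the ID record, so a damaged track could pair an ID with the wrong data record.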

End-of-track Padding

After completing the final sector, tracks are padded with FF or 00 bytes until full.
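The record and gap sizes listed above can be added up to see how much of a 6250-byte track a given layout consumes. The sketch below is plain arithmetic on those figures. The function name and parameter names are illustrative, and the defaults are the gap lengths quoted in this section.

```python
TRACK_DATA = 6250  # useful bytes per track in an MFM_DISK image


def track_bytes(sectors, sector_size=256, pre_index=80, post_index=50,
                id_gap=22, data_gap=54):
    """Return the number of bytes a track layout occupies before padding."""
    sync = 12 + 3                                  # 12 x 00, then 3 x A1 (or C2)
    header = pre_index + sync + 1 + post_index     # gap, sync, FC, gap
    id_record = sync + 1 + 4 + 2                   # FE, four ID bytes, CRC
    data_record = sync + 1 + sector_size + 2       # FB, data, CRC
    per_sector = id_record + id_gap + data_record + data_gap
    return header + sectors * per_sector


for count in (16, 17):
    used = track_bytes(count)
    print(count, used, "fits" if used <= TRACK_DATA else "does not fit")
```

With these gap lengths each 256-byte sector costs 372 bytes, so 16 sectors occupy 6098 bytes while 17 would need 6470. Fitting 17 sectors on a track therefore requires shorter gaps, in line with the remarks on gap lengths elsewhere in this page.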

Exposition

IBM expected a 3% variation in disk rotation speeds between devices. Soft sectoring — marking sector locations with ID marks — was an evolution of hard sectoring. It was therefore conventional to write the ID marks once, when the disk was formatted, but to write and rewrite sector contents as the disk is used.

The gaps therefore have several purposes:

Room for overrun: a drive may rewrite a sector while its motor spins slightly faster than that of the drive that formatted the disk. The new data then occupies a longer stretch of track, exactly as if it had been written at a slower clock rate, so a buffer area is required to prevent the new sector data from overwriting the next ID mark.

Room for an imperfect splice: floppy drives have separate read and write heads, a physical distance apart. That distance has a certain tolerance and is fixed, meaning that it corresponds to a different number of bits on every track. When writing a sector, a disk controller looks for the corresponding ID mark and then replaces the data that follows. Because of the different heads, when it switches from reading to writing it will likely write the new data at a slightly different location from the data it intends to replace, with an abrupt change in phase. Keeping gaps around the data sections prevents accidental removal of adjacent data.

Time for a phase-locked-loop to adapt: any changes in data density that have accumulated from use in different disk drives, and any abrupt changes in phase that result from splicing, will throw the controller's phase-locked-loop ('PLL') out of synchronisation with the underlying bit stream. Long sequences of 00 produce long sequences of the bit pattern 010101010101 and therefore give the PLL time to calibrate before the appearance of critical data. Each ID and sector begins with a suitably-crafted synchronisation mark to avoid framing errors.

A disk that is intended for read-only use can afford to vary from the standard format and use much tighter gaps between sectors; many disks intended for read and write usage can also use shorter gaps. Real disk controllers look only for appropriate synchronisation marks; they do not count gap lengths.

oric/hardware/dsk_disk_format.txt · Last modified: 2016/12/28 17:38 by thomh