I’m releasing a tool I wrote for myself: cue2ccd, a commandline tool to convert CD-ROM disc images from the BIN/CUE format to the CloneCD format. For as many disc image conversion tools as there are out there, I hadn’t found anything open-source or cross-platform that can handle going between these two specific formats, so I wrote it myself.
This is a very niche tool, but it solves one specific problem I have. I own a Rhea optical drive emulator for the Sega Saturn, a device which replaces the original CD drive in the console and allows it to load media from disc images on an SD card instead of physical CDs. The Rhea’s great in a lot of ways, but it has one specific limitation: it doesn’t load games in the BIN/CUE disc image format[1]. Since a lot of media online is in that format, I’ve really been wanting a convenient way to convert existing BIN/CUE images I have lying around into something I can use. Given how niche this is, I don’t expect many other people to need it, but I hope it’s helpful if there’s anyone else in the same situation.
Usage is as simple as possible: just run cue2ccd path_to_cuesheet.cue and it’ll produce new .img, .ccd and .sub files in the same directory, ready for use. I’ve set up convenient commandline installers for Mac, Linux, and Windows, which are available from the website, and it can be installed using Homebrew by running brew install mistydemeo/formulae/cue2ccd.
From here, I’d like to take a little dive into the details of what this kind of conversion looks like and what I needed to do. I’m not planning to go into my specific implementation, but rather I’d like to focus on the details of the formats and the problems I ran into when writing cue2ccd. If you don’t care about the technical details, you can skip the rest of the post (but please enjoy the tool, if you use it!). There are three primary things I needed to handle: writing CloneCD control files (.ccd), writing subcode data (.sub), and merging multi-track images.
Writing CloneCD control files
Like I mentioned in a previous post, CloneCD’s table of contents format is lower-level and much more complex than the cue sheets used by BIN/CUE disc images. Here’s a sample cue sheet for a disc image with one data track and two audio tracks:
[Listing: sample cue sheet, 9 lines]
These nine lines capture (most of) the essential parts of a CD, without getting into details: they list which tracks exist (and which files those tracks are stored in); what type and mode each of those tracks is; and each track’s indices, with their locations on the disc.[2]
The equivalent CloneCD file, meanwhile, is 121 lines long and contains entries that look like this:
[Listing: excerpt of the equivalent CloneCD control file, 30 lines]
And it continues from there; as you can imagine, it’s a much more complex format to generate! At their core, though, both formats represent roughly the same information: the table of contents of a disc, with the tracks and their definitions. All of the information I need to generate the CloneCD files either exists in the cue sheet or can be derived from information I have access to. This data fits into three categories, only the first of which is shared between cue sheets and the CloneCD format:
- Data about each track, including its list of indices and start/stop timestamps[3]
- Overall data about the disc and the session (missing from the cue sheet)
- Data about the disc’s lead-in and lead-out sections (missing from the cue sheet)
Track-level metadata
That’s a lot to go over, but this turned out not to be as complex as I thought it might be. I’ll gloss over the disc-level metadata (which is fairly brief); let’s look at what the two formats share in common instead, the track-level metadata. We’ll do a direct comparison of the same track from both the cue sheet and the CloneCD file, starting with the cue sheet:
    TRACK 01 MODE1/2352
      INDEX 01 00:00:00
Despite being fairly short, it encodes a few different bits of information that we’ll be wanting to reproduce.
- This is track 1 on the disc.
- It’s a data track, specifically a mode 1 data track.[4]
- That data track is stored in the disc image with “raw” 2352-byte sectors, meaning error correction is included. This field isn’t important for us, since cue2ccd only works with raw disc images.
- This track contains a single index, numbered 1, which begins at the timestamp 00:00:00; that is, at the very beginning of the disc image.
It’s all, in other words, pretty core structural metadata about the track and how it’s formed. Now let’s take a look at the CloneCD version:
[Listing: the CloneCD entry for the same track, 15 lines]
At first glance, it looks pretty overwhelming! It turns out, however, it’s not actually as complex as it seems. The field names may seem difficult to understand at first, but the good news is that they’re based directly on the table of contents from the lead-in on a real CD, and so all of them (with the same or similar names) are documented in the CD spec.
- The Point (pointer) field is a hex value which means a few different things depending on context. For a standard track, it’s the track number. In this case, we know from the cue sheet that this is track 1, so it’s set to 1.
- The Control field is a hex value which indicates information about the track type, along with some other metadata that isn’t relevant to us. This is four bits out of a byte in the CD’s binary format, but CloneCD lets us just write a number. There are only two values that matter to us: audio (0) or data (4). We’ve got a data track, so this uses 4.
- The track starts at 00:00:00, so we mark the same values here. They’re just in three separate fields, unlike the cue sheet where they’re written as a single timestamp. We get PMin=0, PSec=2 and PFrame=0. (If that seems like an off-by-two value to you, well-spotted. The explanation comes later.)
- The PLBA field contains essentially the same information as in the Min/Sec/Frame fields, but expressed in terms of the number of sectors since the beginning of the disc’s content. In this case, this track begins at the start of the disc, so that’s 0.
- The AMin, ASec and AFrame values mean something in other contexts, but here are left at zero.
- The Zero field always contains a 0. What a surprise!
- Finally, a few fields aren’t relevant to us and get hardcoded, like Adr and TrackNo.
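To make that mapping concrete, here’s a rough Python sketch of rendering the fields above for one ordinary track. It isn’t cue2ccd’s actual code (which is Rust): the function names, the key=value layout, the hex formatting and the hardcoded Adr value are all assumptions made for illustration, and it only emits the fields named in the list.

```python
FRAMES_PER_SECOND = 75
PREGAP_SECTORS = 150  # two seconds; the "off-by-two" offset mentioned above


def frames_to_msf(frames):
    """Split an absolute frame count into (minutes, seconds, frames)."""
    minutes, rest = divmod(frames, 60 * FRAMES_PER_SECOND)
    seconds, frame = divmod(rest, FRAMES_PER_SECOND)
    return minutes, seconds, frame


def render_track_entry(track_number, is_data, start_sector):
    """Render one track entry as key=value lines.

    start_sector (hypothetical parameter) is the track's index 1 position,
    counted in sectors from the start of the disc image.
    """
    pmin, psec, pframe = frames_to_msf(start_sector + PREGAP_SECTORS)
    fields = [
        ("Point", f"0x{track_number:02x}"),  # track number, written as hex
        ("Adr", "0x01"),                     # hardcoded; the value is an assumption
        ("Control", "0x04" if is_data else "0x00"),
        ("TrackNo", "0"),                    # hardcoded
        ("AMin", "0"),
        ("ASec", "0"),
        ("AFrame", "0"),
        ("Zero", "0"),
        ("PMin", str(pmin)),
        ("PSec", str(psec)),
        ("PFrame", str(pframe)),
        ("PLBA", str(start_sector)),         # sectors since the start of content
    ]
    return "\n".join(f"{name}={value}" for name, value in fields)
```

Calling render_track_entry(1, True, 0) gives PMin=0, PSec=2, PFrame=0 and PLBA=0, matching the values described in the list.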
Whew! In other words, this is mostly the same data as in the cue sheet, it’s just in a more verbose form and using terms that only make sense after reading the CD-ROM spec. Knowing what these fields mean, it wasn’t too hard to generate these CloneCD tracks given the equivalent information from the cue sheet.
Lead-in and lead-out
I mentioned earlier that the CloneCD format includes information about the lead-in and lead-out. These are sections at the beginning and end of the disc that aren’t typically stored, in their raw format, in disc images. The lead-in contains the raw, binary table of contents information for the disc while the lead-out contains information about the disc’s duration.
This is missing from the cue sheet format, but we can derive the info we need from what’s in the CloneCD data. These are stored as “entries” in the CloneCD control file alongside the tracks, and actually look a lot like track data. The fields share names with the ones used for track data, but some of them take on different meanings when used like this.
To give you an idea what this looks like, here’s an abbreviated copy of the first/last track information for this disc with only the fields that differ from regular track data.
[Listing: abbreviated CloneCD entries for the first-track and last-track pointers, 13 lines]
The Point field is the POINTER field defined in 22.3.4.2 of the CD-ROM spec. Previously, when talking about tracks, we set this to the track number. When set to a value outside the 1-99 track number range, it means something different. Two of those values can be seen above: 0xa0 means that this entry contains information about the first track on the disc, while 0xa1 means the last track. When set to these values, it changes the meaning of the remaining fields. Instead of containing timing information, the PMin field is used to specify the track number of the first or last track on the disc, and the other two values are left empty. These two fields tell the player how many tracks to expect when reading the rest of the disc. The PLBA fields are still here, and still calculated based on the Min/Sec/Frame values, but they’re essentially meaningless for these entries since the Min/Sec/Frame aren’t real timestamps.
Finally, we get to the lead-out, which looks like this (relevant fields only):
[Listing: abbreviated CloneCD lead-out entry, 5 lines]
A pointer of 0xa2 indicates that the remaining values are describing the beginning of the disc’s lead-out or, in other words, describing the end of data. Here, the Min/Sec/Frame values are a timecode again, but instead of describing the start of a section of data, they describe the timestamp marking the end of the disc. (Yes, 12.21 seconds is accurate; this is a small test image containing three seconds-long tracks.) This is actually pretty critical info: it tells the CD player when it should stop seeking at the end of the CD, and makes it possible to tell how long the disc is as a whole.
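As a sketch of how the three special pointers fit together, the Python below builds the values for the 0xa0, 0xa1 and 0xa2 entries. It isn’t taken from cue2ccd: pointer_entries and its parameters are hypothetical names, the unused PSec/PFrame fields are written as 0 on the assumption that “empty” means zero, and the lead-out is assumed to sit 150 sectors past the length of the image.

```python
FRAMES_PER_SECOND = 75
PREGAP_SECTORS = 150


def frames_to_msf(frames):
    """Split an absolute frame count into (minutes, seconds, frames)."""
    minutes, rest = divmod(frames, 60 * FRAMES_PER_SECOND)
    seconds, frame = divmod(rest, FRAMES_PER_SECOND)
    return minutes, seconds, frame


def msf_to_plba(minutes, seconds, frames):
    """PLBA is always derived from Min/Sec/Frame, even when they aren't a real time."""
    return (minutes * 60 + seconds) * FRAMES_PER_SECOND + frames - PREGAP_SECTORS


def pointer_entries(first_track, last_track, image_sectors):
    """Build the first-track, last-track and lead-out pointer values."""
    leadout = frames_to_msf(image_sectors + PREGAP_SECTORS)
    entries = []
    for point, (pmin, psec, pframe) in [
        (0xA0, (first_track, 0, 0)),  # PMin holds the first track number
        (0xA1, (last_track, 0, 0)),   # PMin holds the last track number
        (0xA2, leadout),              # a real timestamp: the start of the lead-out
    ]:
        entries.append({
            "Point": point,
            "PMin": pmin,
            "PSec": psec,
            "PFrame": pframe,
            "PLBA": msf_to_plba(pmin, psec, pframe),
        })
    return entries
```

For a three-track image, the 0xa1 entry ends up with a PLBA of 13350, which is just “3 minutes minus 150 sectors” and has nothing to do with where anything is on the disc.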
Parsing and oddities
I went for libcue for parsing cue sheets, since it provides a simple and straightforward track-oriented interface which makes it easy to query all of the track definitions. Writing my own parser in Rust felt out of scope. There are a couple of pure-Rust parsers on crates.io, but they’re oriented around music files like FLAC and are missing a few features I’d need for raw disc images. Instead, I wrote a small crate that acts as a thin binding for libcue while adapting a few bits of its interface to Rust conventions.
One of the more annoying gotchas of the cue sheet format is that it leaves out one important piece of information that’s necessary to render the lead-out entry. Let’s take another peek at the cue sheet, and see if it jumps out at you.
[Listing: the sample cue sheet again, 9 lines]
It lists where tracks and indices start… but it doesn’t show where they end. libcue calculates track ends for every track except the last by checking where the next index starts, and returns that with the rest of the information that’s in the file, but the duration and endpoint of the final track are left completely ambiguous. The only way to get that information is to check the file size of the actual underlying disc image file and calculate how many sectors long it is. It’s not the end of the world, but it is annoying, and it’s the one and only bit of metadata generation I did that required access to the underlying data files. I would have loved to be able to work just off of the metadata.
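The file-size calculation itself is tiny. Here’s a minimal Python sketch, assuming a raw image with 2352-byte sectors; the function names are made up for this example and aren’t from cue2ccd.

```python
import os

SECTOR_SIZE = 2352  # bytes per raw sector


def sectors_in_size(size_bytes):
    """Convert a raw image size in bytes into a whole number of sectors."""
    if size_bytes % SECTOR_SIZE != 0:
        raise ValueError("size is not a multiple of 2352; is this a raw image?")
    return size_bytes // SECTOR_SIZE


def sectors_in_file(path):
    """Sector count of the image file at `path`, which is also where its last track ends."""
    return sectors_in_size(os.path.getsize(path))
```

The last track’s length is then that sector count minus the track’s own start position.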
Another interesting gotcha is the timestamps, which have an unusual off-by-150 problem. As I mentioned previously, the lead-in and lead-out sections are usually omitted from the binary content of a disc image. Since the lead-in takes up the first 150 sectors on the disc, this means that standard disc images actually start at index 150 into the disc, not index 0. This gives us a conundrum for absolute timestamps. Although the BIN/CUE images appear at first glance to have absolute timestamps that are comparable with the ones in the CloneCD file, their definition is slightly different.
With a single BIN file, a cue sheet’s indices are absolute indices into the BIN file. Since the first index within the BIN file is actually sector 150 on the disc, it means that the timecodes for that BIN file are offset from the real CD by 150. Let’s take another look at some absolute timestamps for the two formats for a practical example:
[Listing: cue sheet excerpt for one track, 2 lines]
[Listing: the matching CloneCD timestamp fields, 3 lines]
This track on our sample image begins at 00:06:16 into the BIN/CUE… which means that, for CloneCD, it has an absolute timestamp of exactly two seconds more, 00:08:16. In practice, applying an offset when translating timestamps wasn’t actually that hard, but it was a place where errors seeped in. For a nontrivial part of my tool’s life, I had an off-by-one error from sloppy timestamp conversion.
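The conversion is easiest to get right by going through a plain frame count rather than adjusting the seconds field directly. A small Python sketch of that idea (the function name is hypothetical, and this is an illustration rather than the tool’s own code):

```python
FRAMES_PER_SECOND = 75
PREGAP_SECTORS = 150  # the two-second offset between image time and disc time


def cue_to_disc_timestamp(timestamp):
    """Convert a cue sheet MM:SS:FF timestamp into an absolute disc timestamp."""
    minutes, seconds, frames = (int(part) for part in timestamp.split(":"))
    total = (minutes * 60 + seconds) * FRAMES_PER_SECOND + frames
    total += PREGAP_SECTORS
    minutes, rest = divmod(total, 60 * FRAMES_PER_SECOND)
    seconds, frames = divmod(rest, FRAMES_PER_SECOND)
    return f"{minutes:02d}:{seconds:02d}:{frames:02d}"


print(cue_to_disc_timestamp("00:06:16"))  # 00:08:16
```

Working in frames means the carry from seconds into minutes comes for free, so 00:59:74 correctly becomes 01:01:74.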
Generating subcode data
The second thing I needed to create was subcode data (aka subchannel data), a form of builtin metadata used on CD. On physical CDs, each 2352-byte sector is accompanied by 98 bytes of subcode data. The subcode data is necessary when reading a physical CD but not typically needed when mounting or burning a disc image, so a number of disc image formats—including BIN/CUE and plain ISO files—don’t bother reading or saving it at all. The CloneCD format does back it up, however, and the device I’m using requires valid subcode data. I knew I’d need to generate it myself.
Subcode data is a binary format encoding very similar information to the entries we just saw in the text-based CloneCD control format above. Each 98-byte subcode sector contains two bytes of synchronization words, followed by 96 bytes of data divided into eight channels with lettered names from P to W. In the original CD and CD-ROM specs, only the P and Q channels are specified; channels R through W were set aside for later expansion, and most discs never use them. They were used for standards such as CD-TEXT, which allowed encoding human-readable track names on a CD; CD+G, which allowed encoding simple graphics, such as on karaoke CDs; and various copy protection systems. For my usecase, none of those were relevant, so I only needed to generate data for the P and Q channels.
P channel
The P channel was by far the simplest, and took very little work to do. It’s used to indicate the boundaries between tracks for very primitive early players which didn’t keep track of table of contents information. If a sector is within the first 150 sectors of the start of a track, it’s filled with FF bytes. Otherwise, it contains 00 bytes. There’s no other variation, so it was very easy to implement.
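Here’s that rule as a few lines of Python. It reads the description above literally, and the names are invented for the example; each of the eight channels gets 96 / 8 = 12 bytes per sector.

```python
P_CHANNEL_BYTES = 12       # 96 subcode bytes shared by 8 channels
TRACK_MARKER_SECTORS = 150


def p_channel(sector, track_start):
    """Return the 12 P channel bytes for `sector`.

    track_start (hypothetical parameter) is the first sector of the track
    that `sector` belongs to.
    """
    in_marker = 0 <= sector - track_start < TRACK_MARKER_SECTORS
    return (b"\xff" if in_marker else b"\x00") * P_CHANNEL_BYTES
```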
Q channel
The Q channel is slightly more complex. Before getting into the details, let’s look at a little sample of what a single Q channel sector looks like. Here’s the raw bytes in hex format:
[Listing: one Q channel sector as raw hex bytes]
There’s a chance you may be able to put together some of this based on the description of the entries in a CloneCD control file earlier, but don’t worry, we’ll come back to this later.
This channel primarily consists of timing information: it encodes the timestamp of the currently-playing sector, a flag indicating whether this sector is data or audio, and some simple forms of metadata[5]. It also contains a 16-bit checksum, allowing the data in the rest of the Q channel to be validated. The metadata in question isn’t relevant to my usecase, so I only needed to worry about the timestamps, the data flag, and the checksum.
Control and q-Mode fields
The first byte is separated into two four-bit fields. That is, it contains fields which are each smaller than one byte, an idea that may be unfamiliar if you haven’t worked with binary data before. Since a byte contains eight bits, it’s possible to fit multiple fields into a single byte if they’re smaller than one byte. In this case, instead of using the full byte for one field, we can split that one byte in half and use it to store two four-bit fields.
The first of these fields, the control field, consists of a few different flags, but only one is relevant here: the data flag. When unset, it indicates that this sector contains audio; when set, it indicates that it contains data. In our case, that means taking the first four bits of our byte and setting them to 0100.
The second field indicates the type of data being encoded in the following bytes. Since I’m ignoring the alternate metadata that could be represented here, I always set it to the value indicating that the bytes to follow will contain timing information. In our case, that means taking the last four bits of our byte and setting them to 0001. Putting it all together, we get a byte with the bits:
    01000001
Or, read as a single byte:
    0x41
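In code, packing two four-bit fields into one byte is a shift and an OR. A short Python illustration, with constant names made up for this example:

```python
CONTROL_AUDIO = 0b0000
CONTROL_DATA = 0b0100   # only the data flag is set
Q_MODE_TIMING = 0b0001  # "the following bytes contain timing information"


def control_mode_byte(is_data):
    """Pack the control field (high four bits) and the mode field (low four bits)."""
    control = CONTROL_DATA if is_data else CONTROL_AUDIO
    return (control << 4) | Q_MODE_TIMING


print(format(control_mode_byte(True), "08b"))  # 01000001
print(hex(control_mode_byte(True)))            # 0x41
```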
Timestamps
As with the CloneCD control file, timestamps are stored as separate minute, second and fraction fields. The Q channel contains two different timestamps and some other timekeeping information:
- The track number
- The index number
- The timestamp relative to the current track
- The absolute timestamp
All of these values are stored in binary-coded decimal (BCD) format, which has the side bonus that it makes this data easy to read by eye with a hex editor. I made use of that while debugging.
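BCD stores each decimal digit in its own four bits, which is why the hex dump reads like the decimal number. A minimal Python sketch (to_bcd is a name chosen for this example):

```python
def to_bcd(value):
    """Encode a number from 0 to 99 as one binary-coded decimal byte."""
    if not 0 <= value <= 99:
        raise ValueError("BCD bytes can only hold two decimal digits")
    return ((value // 10) << 4) | (value % 10)


# The timestamp 00:08:16 becomes the bytes 00 08 16.
print(bytes([to_bcd(0), to_bcd(8), to_bcd(16)]).hex())  # 000816
```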
For the most part, these timestamp fields are straightforward to implement so long as I pass the right data in. There was one fun gotcha, however. CD audio contains gaps between tracks called “pregaps”; they’re defined as index 0 within a track, with the track itself beginning at index 1. They create an interesting edge case for calculating relative timestamps. What does it mean to track the timestamp relative to the start of the track for a time that isn’t part of the track? Since this binary-coded decimal format doesn’t support negative numbers, the standard uses a slightly strange but appropriate workaround. Within the pregaps, the relative timestamp instead starts at the length of the pregap and then counts down until it hits 0, which marks the beginning of the track, at which point it begins counting up again. Needless to say, this was the source of a few fun off-by-one bugs.
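One way to express the countdown is as the distance from index 1 in either direction. This Python sketch follows the description above (0 lands exactly on the first sector of index 1); the function name and the choice to return the index number alongside the frame count are just for illustration.

```python
def relative_position(sector, index1_start):
    """Return (index number, relative frame count) for a sector.

    Both arguments are absolute sector numbers; index1_start is where the
    track proper begins.
    """
    if sector < index1_start:
        # Inside the pregap (index 0): count down towards the track start.
        return 0, index1_start - sector
    # Inside the track (index 1): count up from zero.
    return 1, sector - index1_start
```

With a 150-sector pregap starting at sector 300, the relative count runs 150, 149, … 1 through the pregap, hits 0 at sector 450, and then climbs again.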
Checksum
Finally, it ends with a 16-bit (two-byte) checksum of the remainder of the data. The CRC-16 routine it uses is specified in the CD-ROM spec; I generated a suitable C CRC-16 routine using the Ruby crc library, then translated it into Rust. I’ve published it standalone as the cdrom_crc crate.
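For a sense of scale, a bitwise CRC-16 fits in a dozen lines of Python. This is a sketch based on my reading of the spec rather than a copy of the cdrom_crc crate: it assumes the polynomial x^16 + x^12 + x^5 + 1 (0x1021), an initial value of 0, and that the result is stored bit-inverted, most significant byte first. Treat those parameters as assumptions and check them against the spec before relying on them.

```python
def crc16(data):
    """Bitwise CRC-16 with polynomial 0x1021 and initial value 0 (no reflection)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def q_checksum(q_data):
    """Checksum bytes for the Q channel data that precedes them.

    Assumes the stored value is the inverted CRC, big-endian.
    """
    return (crc16(q_data) ^ 0xFFFF).to_bytes(2, "big")
```

A table-driven version would be faster, but at 12 bytes per sector the bitwise loop is easy to verify by hand.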
Putting it all together
Here’s that raw data again, with each byte annotated:
[Listing: the same Q channel bytes, annotated byte by byte, 22 lines]
Not actually that much information, and not too hard to make sense of after taking the time to assemble everything, but it certainly took some work to get there.
Luckily for me, the CloneCD representation of subcode data is simplified in a few ways that made things easier. CloneCD ignores the two sync bytes, storing only the 96 data bytes, which saved me the trouble of handling them. It also reorders the data to be easier to reason about. On a physical CD, the subcode for a sector isn’t contiguous. Instead, every 32-byte frame of a data sector is followed by a single byte containing one single bit from each of the eight channels. Assembling a complete byte for the channels requires waiting for eight frames and reordering the bits as they come in. CloneCD, meanwhile, reorders the data into the standard byte order. There may be technical reasons why this is the case when streaming from a CD, but I’m just grateful to get to write bytes like a normal person.
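To show what that reordering amounts to, here’s a Python sketch that turns 96 interleaved subcode bytes into eight 12-byte channel blocks. It’s an illustration, not something cue2ccd needs to do (it generates the per-channel bytes directly). Two details are assumptions on my part: that P occupies the most significant bit of each interleaved byte down to W in the least significant, and that the output order is P first and W last.

```python
CHANNELS = 8
BYTES_PER_CHANNEL = 12  # 96 bits per channel per sector


def deinterleave(raw):
    """Regroup 96 interleaved subcode bytes into 12 bytes per channel."""
    if len(raw) != CHANNELS * BYTES_PER_CHANNEL:
        raise ValueError("expected 96 subcode bytes")
    out = bytearray(len(raw))
    for position, byte in enumerate(raw):
        for channel in range(CHANNELS):
            # Assumed layout: bit 7 is P, bit 6 is Q, ... bit 0 is W.
            if (byte >> (7 - channel)) & 1:
                target = channel * BYTES_PER_CHANNEL + position // 8
                out[target] |= 0x80 >> (position % 8)
    return bytes(out)
```

Feeding it 96 bytes of 0x80 (only the P bit set) produces twelve FF bytes followed by 84 zero bytes, which is the same shape as the P channel described earlier.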
Merging disc images
I actually had a version of cue2ccd ready to release about a year ago, but I had one last feature I really wanted and kept putting off: merging disc images.
More specifically, I wanted to handle disc images containing multiple files. A lot of BIN/CUE disc images use a single BIN file containing all tracks, sort of like how a CD itself is structured, and that’s what the initial version of cue2ccd was written for. In recent years, however, split images have become more common. These are still raw images, but they use separate raw disc image files for every track on the disc. In theory, doing this is easy; the data is the same, you just need to concatenate the files. No work at all. Unfortunately, the metadata is a bit harder. Let’s take a look at the disc from earlier in its original one-file version:
[Listing: single-file cue sheet, 9 lines]
Now let’s take a look at the exact same disc, but in a one-file-per-track form:
[Listing: one-file-per-track cue sheet, 11 lines]
It may strike you that those timestamps aren’t useful. And you wouldn’t be entirely wrong. They’re all the same now! What the heck? What happened?
Well, as I (briefly) mentioned earlier, the timestamps in a cue sheet are timestamps into that file, not absolute timestamps into the disc. For a single-file disc image there’s almost no difference between the two, except the off-by-150 issue I mentioned previously. But if a single binary also contains a single track, it suddenly becomes a lot more obvious that the offsets for each track are specific to each file.
So, in practice, implementing this didn’t just mean concatenating the files. It also meant, for each track, keeping track of the size of the disc up until that point so that I could convert each of these relative timestamps into an absolute one. It’s not necessarily hard work but it’s an easy source of off-by-one errors and other similar mistakes, so I had a few revisions with subtly wrong timing. It also runs into a harsher version of the “no duration of the last track” problem: since every track is its own file, now every track is the last track in its file, so none of them have durations available from the metadata. I was able to apply what I’d already written to calculate the duration based on the filesize, with a fix for a bug that only happened when it wasn’t the last track in a larger file, but I’d certainly have preferred not to have to do it at all.
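The bookkeeping boils down to a running total of sectors. Here’s a minimal Python sketch under simplified assumptions: each entry is one file with its size in bytes and its index positions in sectors relative to that file, and every file is a raw image with 2352-byte sectors. The function and its input shape are invented for this example.

```python
SECTOR_SIZE = 2352


def absolute_indices(files):
    """Convert per-file index positions into positions within the merged image.

    `files` is a list of (size_in_bytes, [index positions in sectors]) pairs,
    in disc order. Returns the converted positions and the total sector count.
    """
    offset = 0
    converted = []
    for size_bytes, indices in files:
        converted.append([offset + index for index in indices])
        # Each file's length is only known from its size on disk.
        offset += size_bytes // SECTOR_SIZE
    return converted, offset


files = [
    (466 * SECTOR_SIZE, [0]),
    (150 * SECTOR_SIZE, [0, 75]),  # a track with an index 0 and an index 1
    (155 * SECTOR_SIZE, [0]),
]
print(absolute_indices(files))  # ([[0], [466, 541], [616]], 771)
```

The final offset doubles as the length of the merged image, which is exactly what the lead-out entry needs.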
In conclusion: CD is weird
Honestly, it’s been fun to get to dig deeper into a format not many people still care about these days. I’d also like to thank a couple of people whose help with previous projects was very useful for this one: the creator of the Rhea, Phoebe and GDEmu hardware, who was gracious in providing support debugging my earliest attempts at generating files compatible with his hardware; and CyberWarriorX, with whom I worked on an earlier CloneCD-generating project.
[1] It also supports a few other formats, such as DiscJuggler and Alcohol 120%, but there aren’t any open-source tools to convert to those either.
[2] Each track is divided into one or more indices. Index 1 is the actual start of the track, while index 0 defines a gap that comes before the actual track begins, and indices 2 and beyond are rare. The gap between tracks is typically called a “pregap”. On a real CD player, when picking a track by number, the player will start straight from that track’s index 1. When letting the disc play through from a previous track, however, the disc will play the pregap defined in index 0 first before proceeding to index 1.
[3] Since CD was originally designed just for music, all indices to locations on the disc are measured in terms of timestamps instead of a more data-oriented index like an address in bytes. These timestamps are stored in three parts: minutes, seconds, and 1/75 fractions of a second. For example, if a track starts at two seconds into the disc, its timestamp is 00:02:00. libcue translates these into a logical block address, i.e. a number of sectors, which would mean the previous example is 150. The CloneCD format reproduces the original CD-ROM spec’s timestamps, but additionally stores logical block addresses in some places for convenience.
[4] There are a few different modes of data track which have different data layouts. A raw data sector is always 2352 bytes with a mixture of data and error correction data. The different modes have different ratios of data to error correction. Mode 1, the original and most common mode, uses 2048 bytes out of every sector for data, with the remaining 304 bytes holding the sync pattern, sector header, and error detection and correction data.
[5] It’s also used in the disc’s lead-in and lead-out, but I’m not dealing with those sections of the disc.