The Laser Playback Head of Omniscience

 

paranoia: F.A.Q.

"Suspicion Breeds Confidence!"
--Brazil



September 12, 2008

For those new to Paranoia and cdparanoia, this is the best, first place to look for information and answers to your questions.

Questions about errors and problems with Paranoia/cdparanoia aren't here; look at the troubleshooting page.

Table of Contents:

  1. Questions about the Paranoia and cdparanoia projects
    1. What is cdparanoia?
    2. Why use cdparanoia?
    3. What is Paranoia?
    4. Is cdparanoia / Paranoia portable?
    5. What is Paranoia's history?
    6. Is cdparanoia/Paranoia related to cdda2wav?
    7. What are the differences between Paranoia versions?
    8. Are there cdparanoia mailing lists for users or developers?
    9. What is Paranoia's current development status?
  2. General questions about cdparanoia and using it
    1. Requirements to run cdparanoia
    2. What drives does cdparanoia support?
    3. I can play audio CDs perfectly; why is reading the CD into a file so difficult and prone to errors?
    4. Does cdparanoia lose quality from the CD recording?
    5. What drives are recommended for use with cdparanoia?
    6. What is with the new cache features in 10.2?
    7. Can cdparanoia detect pregaps? Can it remove the two second gaps between tracks?
    8. Why don't you implement CDDB? A GUI? Four million other features I want?
    9. The progress meter: What is that weird bargraph during ripping?
    10. Do symbols in the progress bar mean a bad rip?
    11. How can I tell if my drive would be OK without Paranoia?
    12. Why do the binary files from two reads differ when compared?
    13. Why does CDParanoia rip files into several formats but not CDDA format?
  3. Questions about 'vintage' cdparanoia and older Linux kernels (before 2.4)
    1. Why should I upgrade to a newer version?
    2. Requirements to run cdparanoia
    3. Does Cdparanoia support ATAPI drives? SCSI Emulation? USB drives? Parallel port drives?
    4. (Linux only) What is the biggest value of SG_BIG_BUFF I can use?


Questions about the Paranoia and cdparanoia projects

What is cdparanoia?

Cdparanoia is a Compact Disc Digital Audio (CDDA) Digital Audio Extraction (DAE) tool, commonly known on the net as a 'ripper'. The application is built on top of the Paranoia library, which does the real work (the Paranoia source is included in the cdparanoia source distribution). Cdparanoia reads audio from the CDROM directly as data, with no analog step between, and writes the data to a file or pipe in WAV, AIFC or raw 16 bit linear PCM.

Cdparanoia is a bit different from most other CDDA extraction tools. It contains few-to-no 'extra' features, concentrating only on the ripping process and knowing as much as possible about the hardware performing it. Cdparanoia will read correct, rock-solid audio data from inexpensive drives prone to misalignment, frame jitter and loss of streaming during atomic reads. Cdparanoia will also read and repair data from CDs that have been damaged in some way.

Cdparanoia is easy to use and administer; it has no compile time configuration, happily autodetecting the CDROM, its type, its interface and other aspects of the ripping process at runtime. A single binary can serve the diverse hardware of the do-it-yourself computer laboratory from Hell.


Why use cdparanoia?

All CDROM drives are not created equal. You'll need cdparanoia if yours is a little less equal than others-- or maybe you just keep your CD collection in a box full of gravel. Jewel cases are for wimps; you know what I'm talking about.

Unfortunately, most rippers cannot work properly with a large number of CDROM drives in the desktop world today. The most common problem is sporadic or regular clicks and pops in the read sample, regardless of options or settings. The great lesson from coding software for CDROMS the past 15 years is that drives that don't have bugs reading digital audio are exceptionally rare. Most drives advertise that they support 'error correcting streaming' or 'perfect reconstruction'. If that's true, why is your music collection full of glitches?

Cdparanoia is also smarter about finding and probing CDDA support from drives. Cdparanoia knows most of the old proprietary CDDA reading command sets from the bad old days and can autodetect them all. Many drives, especially older drives, that do not work at all with other rippers will work just fine with cdparanoia.

Nor will you need to type in PCI or SCSI bus ids to use your cdroms ever again. OK, that's not so common today, but it was a *killer* feature 15 years ago ;-) That alone nearly sent cdda2wav to its grave ;-)


What is Paranoia?

Paranoia is a library that provides a unified, robust interface for Linux applications that want to use CDROM devices.

In addition to Linux device interface unification, the library provides tools for automatically identifying devices, and intelligent handling/correction of errors at all levels of the interface. On top of a generic low-level packet command layer, Paranoia implements high-level error-correcting interfaces for tasks such as CDDA where broken or vastly non-standard devices are the rule, rather than the exception.


Is cdparanoia / Paranoia portable?

Paranoia is Linux only, although it runs on all flavors of Linux; it is not limited to i386 or x86_64.

In the past, there was effort to make this library as portable as possible across other OS platforms. As time has worn on, I've come to realize that most cross-platform libraries that try to bring a completely uniform interface to a wide variety of very different OSes:

  1. fail to be all that uniform
  2. limit functionality to some subset of what any given platform can do
  3. introduce additional layers of bugs
  4. just don't work very well
I'm a Linux user. I'm a Linux developer. As a user and developer, I don't really give a damn about other platforms. I want my apps to be the best Linux apps they can be. Full stop.

Should other developers like to participate on full-featured ports to other platforms, I'm all for full cooperation. I'd like to see those ports take *full* advantage of the target platform, not turn into a halfbaked subset of the features that work on both.


What is Paranoia's history?

Is cdparanoia/Paranoia related to cdda2wav?

Paranoia I/II and cdparanoia began life as a set of patches to Heiko Eissfeldt's 'cdda2wav' application in late 1994. Cdparanoia gained its own life as a rewrite of cdda2wav in January of 1998 as "Paranoia III". Paranoia has had no real relation to cdda2wav since then.


What are the differences between Paranoia versions?

Paranoia I and II were a set of patches to Heiko Eissfeldt's cdda2wav 0.8. These patches did nothing more than add some error checks to the standard cdda2wav. They were inefficient and only worked with some drives. Paranoia III was the first version to be written separately from cdda2wav in the form of a standalone library.

The last of the previous generation of cdparanoia was cdparanoia III version 9.8 from early 2001, designed for Linux 2.0 through early 2.4. At this point, the project met all its original goals and was declared 'finished'.

Linux kept moving forward, finally unifying CDROM device access across all device types behind a new kernel interface in the 2.6 kernel series. Cdparanoia 9.8, the last of the 9.x versions, dates from 2001 and was designed to support Linux only through early 2.4 kernels.

Paranoia IV is an upcoming generation that intends to improve the library API as well as take advantage of new CDROM features that existed on only a few specialist drives five years ago, but are now ubiquitous even in inexpensive models. Where Paranoia III concentrated on bulletproof extraction from good media and reliable extraction from damaged media, Paranoia IV will concentrate on the best possible extraction and correction from even heavily damaged media-- so long as the drive can still recognize the disc.


Are there cdparanoia mailing lists for users or developers?

Yes. In addition to the mailing lists below, read-only Subversion access to Paranoia is available here.

What is Paranoia IV's current development status?

Paranoia IV has been on hiatus for several years while Xiph.Org concentrated on other projects such as Ogg. It was originally intended to be little more than a more cross-platform update of Paranoia III that included other devices; although the project drew some outside interest, it's not clear there's any great need for it.

Since Paranoia III was originally declared 'finished' in 1999 (subsequent releases have mostly just been to fix bugs and keep up with kernel changes), a number of features that once existed only on very expensive CDROM drives have become more widespread. In particular, drives today are much more capable of fine-grained reporting of media damage. Thus Paranoia IV is now intended to be an update of Paranoia III that improves error handling and reconstruction ability.


Questions about using Paranoia and cdparanoia

Requirements to run cdparanoia 10

  1. A CDDA capable CDROM drive (in 2008, virtually all CDROM drives are)
  2. Linux 2.0 through 2.2:
    1. kernel support for the particular CDROM in use
    2. kernel support for the generic SCSI interface (if using a SCSI CDROM drive) and proper device (/dev/sg?) files (get them with the MAKEDEV script) in /dev. Most distributions already have the /dev/sg? files.
  3. Linux 2.4, through 2.6:
    1. kernel support for either the generic SCSI or SG_IO interfaces; most modern kernels are built with SG_IO by default.
The cdparanoia binary will likely work with Linux 1.2 and 1.3, but I do not actively support kernels older than 2.0. I do know for a fact that the source will not build on kernel installs older than 2.0, but the problems are mostly related to the ever-changing locations of proprietary cdrom include files.


Does Cdparanoia support ATAPI drives? SCSI Emulation? USB drives? Parallel port drives?

Cdparanoia 9 and 10 support the full ATAPI, IDE-SCSI and SCSI generic interfaces under Linux. Cdparanoia 10 adds SG_IO support, which is the default interface for all modern CDROM drives under Linux 2.6.

Note that the native 'cooked' ATAPI driver is supported, but should be considered deprecated; SG_IO and IDE-SCSI emulation both work better with ATAPI drives. This is an issue of control; the other interfaces give cdparanoia complete control over the drive whereas the native ATAPI driver insists on hiding the device under an abstraction layer with poor error handling capabilities. Note also that a number of ATAPI drives that do not work at all with the ATAPI driver (error 006: Could not read audio) *will* work with SG_IO and IDE-SCSI emulation.

USB drives are fully supported through the SG_IO and SG interfaces.

Parallel port based CDROM (paride) drives are not yet explicitly supported.


I can play audio CDs perfectly; why is reading the CD into a file so difficult and prone to errors? It's just the same thing.

Unfortunately, it isn't that easy.

The audio CD is not a random access format. It can only be played from some starting point in sequence until it is done, like a vinyl LP. Unlike a data CD, there are no synchronization or positioning headers in the audio data (a CD, audio or data, uses 2352 byte sectors. In a data CD, 304 bytes of each sector is used for header, sync and error correction. An audio CD uses all 2352 bytes for data). The audio CD *does* have a continuous fragmented subchannel, but this is only good for seeking +/-1 second (or 75 sectors or ~176kB) of the desired area, as per the SCSI spec.
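
The sector figures above follow directly from the audio format. A minimal sketch of the arithmetic, assuming the standard CD audio parameters (44.1kHz, 16 bit, stereo, 75 sectors per second at 1x); the constant names are illustrative and not taken from the cdparanoia source:

```python
# Standard CD audio parameters; the names are illustrative only.
SAMPLE_RATE_HZ = 44100      # samples per second, per channel
CHANNELS = 2                # stereo
BYTES_PER_SAMPLE = 2        # 16 bit linear PCM
SECTORS_PER_SECOND = 75     # sectors delivered per second at 1x

# One second of audio, which is also the 75-sector seek window quoted above.
bytes_per_second = SAMPLE_RATE_HZ * CHANNELS * BYTES_PER_SAMPLE
sector_bytes = bytes_per_second // SECTORS_PER_SECOND

# A data CD gives up 304 bytes per sector to header, sync and error correction.
data_cd_payload = sector_bytes - 304

print(bytes_per_second)   # 176400
print(sector_bytes)       # 2352
print(data_cd_payload)    # 2048
```

So the "~176kB" seek window is exactly one second of audio, and every one of a sector's 2352 bytes is sample data with nothing left over for positioning.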

When the CD is being played as audio, it is not only moving at 1x, the drive is keeping the media data rate (the spin speed) exactly locked to playback speed. Pick up a portable CD player while it's playing and rotate it 90 degrees. Chances are it will skip; you disturbed this delicate balance. In addition, a player is never distracted from what it's doing... it has nothing else taking up its time. Now add a non-realtime, (relatively) high-latency, multitasking kernel into the mess; it's like picking up the player and constantly shaking it.

CDROM drives generally assume that any sort of DAE will be linear and throw a readahead buffer at the task. However, the OS is reading the data as broken up, separate read requests. The drive is doing readahead buffering and attempting to store additional data as it comes in off media while it waits for the OS to get around to reading previous blocks. Seeing as how, at 36x, data is coming in at 6.2MB/second, and each read is only 13 sectors or ~30k (due to DMA restrictions), one has to get off 208 read requests a second, minimum without any interruption, to avoid skipping. A single swap to disk or flush of filesystem cache by the OS will generally result in loss of streaming, assuming the drive is working flawlessly. Oh, and virtually no PC on earth has that kind of I/O throughput; a Sun Enterprise server might, but a PC does not. Most don't come within a factor of five, assuming perfect realtime behavior.
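
The request rate quoted above can be checked with a few lines of arithmetic. This is a sketch only; the function names are made up for illustration, and it assumes 2352 byte sectors at 75 sectors per second for 1x:

```python
import math

SECTOR_BYTES = 2352
SECTORS_PER_SECOND_1X = 75

def data_rate_bytes(speed):
    """Bytes per second coming off the media at a given speed multiple."""
    return speed * SECTORS_PER_SECOND_1X * SECTOR_BYTES

def read_requests_per_second(speed, sectors_per_read):
    """Smallest whole number of reads per second that keeps up with the drive."""
    sectors_per_second = speed * SECTORS_PER_SECOND_1X
    return math.ceil(sectors_per_second / sectors_per_read)

print(data_rate_bytes(36))               # 6350400 bytes per second
print(13 * SECTOR_BYTES)                 # 30576 bytes per 13-sector read
print(read_requests_per_second(36, 13))  # 208
```

The byte rate works out to a little over 6MB per second, in line with the rounded figure in the text, and the 208 requests per second leave no slack at all for scheduling delays.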

To keep piling on the difficulties, faster drives are often prone to vibration and alignment problems; some are total fiascos. They lose streaming *constantly* even without being interrupted. Philips determined 15 years ago that the CD could only be spun up to 50-60x until the physical CD (made of polycarbonate) would deform from centripetal force badly enough to become unreadable. Today's players are pushing physics to the limit. Few do so terribly reliably.

Note that CD 'playback speed' is an excellent example of advertisers making numbers lie for them. A 36x cdrom is generally not spinning at 36x a normal drive's speed. As a 1x drive is adjusting velocity depending on the access's distance from the hub, a 36x drive is probably using a constant angular velocity across the whole surface such that it gets 36x max at the edge. Thus it's actually spinning slower, assuming the '36x' isn't a complete lie, as it is on some drives.
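
To put rough numbers on the difference between the two spin strategies, here is a small sketch. The physical constants are assumptions that do not come from this FAQ: a 1x scanning velocity of 1.3 m/s (the nominal range is 1.2 to 1.4 m/s) and a program area running from about 25mm to 58mm from the center. The function names are hypothetical.

```python
import math

LINEAR_VELOCITY_M_S = 1.3   # assumed 1x scanning velocity
INNER_RADIUS_M = 0.025      # assumed start of the program area (near the hub)
OUTER_RADIUS_M = 0.058      # assumed end of the program area (the edge)

def rpm_for_speed(speed, radius_m):
    """Rotation rate needed to read at `speed` times 1x at the given radius."""
    revs_per_second = speed * LINEAR_VELOCITY_M_S / (2 * math.pi * radius_m)
    return revs_per_second * 60

def cav_speed_at(radius_m, rated_speed):
    """Speed multiple a constant angular velocity drive delivers at a radius,
    given that its rated speed is only reached at the outer edge."""
    return rated_speed * radius_m / OUTER_RADIUS_M

print(round(rpm_for_speed(1, INNER_RADIUS_M)))     # 1x drive, near the hub
print(round(rpm_for_speed(1, OUTER_RADIUS_M)))     # 1x drive, at the edge
print(round(rpm_for_speed(36, OUTER_RADIUS_M)))    # constant rate of a "36x max" drive
print(round(cav_speed_at(INNER_RADIUS_M, 36), 1))  # what that drive delivers near the hub
```

Under these assumptions a 1x drive slows from roughly 500 rpm at the hub to a bit over 200 rpm at the edge, while a "36x max" drive holds one rotation rate and delivers well under half its rated speed on the innermost tracks.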

Because audio discs have no headers in the data to assist in picking up where things got lost, most drives will just guess.

This doesn't even *begin* to get into stupid firmware bugs. Even Plextors have occasionally had DAE bugs (although in every case, Plextor has fixed the bug *and* replaced/repaired drives for free). Cheaper drives are often complete basket cases.

Rant Update (for those in the know):

Several folks, through personal mail and on Usenet, have pointed out that audio discs do place absolute positioning information for (at least) nine out of every ten sectors into the Q subchannel, and that my original statement of +/-75 sectors above is wrong. I admit to it being misleading, so I'll try to clarify.

The positioning data certainly is in subchannel Q; the point is moot however, for a couple of reasons.

  1. The SCSI and ATAPI specs (there are a couple of each, pick one) don't give any way to retrieve the subchannel from a desired sector. The READ SUB-CHANNEL command will hand you Q all right, you just don't have any idea where exactly that Q came from. The command was intended for getting rough positioning information from audio discs that are paused or playing. This is audio; missing by several sectors is a tiny fraction of a second.

  2. Older CDROM drives tended not to expect 'READ SUB-CHANNEL' unless the drive was playing audio; calling it during data reads could crash the drive and lock up the system. I had one of these drives (Apple 803i, actually a repackaged Sony CD-8003).

  3. MMC-2 *does* give a way to retrieve the Q subchannel along with user data in the READ CD command. Although the drive is required to recognize the feature, it is allowed to simply return zeroes (effectively leaving the feature unimplemented). Guess how many drives actually implement this feature: not many.

  4. Assuming you *can* get back the subchannel, most CDROM drives seem to understand audio discs primarily at the "little frame" level; thus sector-level structures aren't reliable. One might get a reassembled subQ, but if the read began in the middle of a sector (or dropped a little frame in the middle; many do), the subQ is likely corrupt and useless.

As reassembling uncorrupted frames is easy without the subchannel, and corrupted reads likely result in a corrupted subchannel too, cdparanoia treats the subchannel as more trouble than it's worth (during verification).

At least one other package (Exact Audio Copy for Win32) manages to use the subchannel to enhance the Table of Contents information. I don't know if this only works on MMC-2 drives that support returning Q with READ CD, but I think I'm going to revisit using the subchannel for extra TOC information.


Does cdparanoia lose quality from the CD recording? Does it just re-record the analog signal played from the CDROM drive?

No to both. Cdparanoia (and all other true CD digital audio extraction tools) reads the values off the CDROM in digital form. The data never comes anywhere near the soundcard, and does not pass through any conversion to analog first.

Can cdparanoia detect pregaps? Can it remove the two second gaps between tracks?

Not yet. This feature is slated to appear in Paranoia IV.


Why don't you implement CDDB? A GUI? Four million other features I want?

Too many features spoil the broth. "Software is not perfect when there is nothing left to add, but rather when there is nothing extraneous left to take away." The goal of cdparanoia is perfect, rock-solid audio from every capable cdrom on every platform. As this goal has not yet been met, I'm uninterested in adding unrelated capability to the core engine.

Several GUIs that incorporate cdparanoia already exist; I'm in the process of compiling a list (see the links page). Other software that implements new features by wrapping around cdparanoia (like CDDB lookup) also exists.

'Cdparanoia' will not play to sound cards (you can always pipe the output to a WAV player), do MD5 signatures, read CD catalog or serial numbers (this *is* a feature I plan to add), search indexes, do rate reduction (use Sox, Ogg or a million others), or generally make use of the maximum speed available from a CDROM drive.

If your CDROM drive is *not* prone to jitter and you don't have scratched discs to worry about, you might want to look at the original cdda2wav for features cdparanoia does not have. Keep in mind however that even the really good drives do occasionally stumble. I know of at least one cdparanoia user who insists on using full paranoia with his Plextor UltraPlex because it once botched a single sector from a rip; he'd already burned the track to several CD-Rs before noticing...


The progress meter: What is that weird bargraph during ripping?

It's a progress/status indicator. There's a completion bargraph, a number indicating the last sector number completely verified of the read currently happening, an overlap indicator, a gratuitous smilie, and a heartbeat indicator to show if the process is still alive, hung, or spinning.

The bargraph also marks points during the read with characters to indicate where various 'paranoia' features were tripped into action.

  A. progress bargraph
  B. sector counter
  C. overlap indicator
  D. smilie (gratuitous)
  E. heartbeat

The bargraph

Different bargraph characters indicate different things occurred during that part of the read. The letters are hierarchical; for example if a transport error occurs in the same sector as jitter, the bargraph will print 'e' instead of '-'.

Legend of characters
- A hyphen indicates that two blocks overlapped properly, but they were skewed (frame jitter). This case is completely corrected by Paranoia and is not a cause for concern.
+ A plus indicates not only frame jitter, but an unreported, uncorrected loss of streaming in the middle of an atomic read operation. That is, the drive lost its place while reading data, and restarted in some random incorrect location without alerting the kernel. This case is also corrected by Paranoia.
e An 'e' indicates that a transport level SCSI or ATAPI error was caught and corrected. Paranoia will completely repair such an error without audible defects.
X An "X" indicates a scratch was caught and corrected. Cdparanoia will interpolate over any missing/corrupt samples.
* An asterisk indicates a scratch and jitter both occurred in this general area of the read. Cdparanoia will interpolate over any missing/corrupt samples.
! A ! indicates that a read error got through stage one of error correction and was caught by stage two. Many '!' are a cause for concern; it means that the drive is making continuous silent errors that look identical on each re-read, a condition that can't always be detected. Although the presence of a '!' means the error was corrected, it also means that similar errors are probably passing by unnoticed. Upcoming releases of cdparanoia will address this issue.
V A V indicates a skip that could not be repaired or a sector totally obliterated on the medium (hard read error). A 'V' marker generally results in some audible defect in the sample.

The smilie

The smilie is actually relevant. It makes different faces depending on the current errors it's correcting.

Legend of smilies
:-) Normal operation. No errors to report; if any jitter is present, it's small.
:-| Normal operation, but average jitter is quite large.
:-P A rift was found in the middle of an atomically read block; in other words, the drive lost streaming in the middle of a read and did not abort, alert the kernel, or restart in the proper location. The drive silently continued reading in some random location.
:-/ The read appears to be drifting; cdparanoia is shifting all of its reads to make up for it.
8-| Two matching vectors were found to disagree even after first stage verification; this is an indication that the drive is reliably dropping/adding bytes at consistent locations. Because the verification algorithm is partially based on rereading and comparing vectors, if two vectors read incorrectly but identically, cdparanoia may never detect the problem. This smilie indicates that such a situation *was* detected; other instances may be slipping through.
:-0 Transport or drive error. This is normally not a cause for concern; cdparanoia can repair just about any error that it actually detects. For more information about these errors, run cdparanoia with the -v option. Any and all errors and a description will dump to stderr.
:-( Cdparanoia detected a scratch.
;-( Cdparanoia gave up trying to repair a sector; it could not read consistent enough information from the drive to do so. At this point cdparanoia will make the best guess it has available and continue (a V appears in the bargraph at this point). This often results in an audible defect.
:^D Cdparanoia displays this smilie both when finished reading a track and also if no error correction mechanism has been tripped so far reading a new track.
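
The blind spot described for the 8-| smilie can be illustrated with a toy model. This is not cdparanoia's verification code. It is a minimal sketch that assumes a block is accepted as soon as two reads of it agree, and `verified_read` and the fake drives are hypothetical names invented for the example.

```python
import itertools

def verified_read(read_block, max_attempts=8):
    """Re-read until two reads agree; return the agreed data, or None on giving up."""
    seen = []
    for _ in range(max_attempts):
        data = read_block()
        if data in seen:
            return data
        seen.append(data)
    return None

TRUE_DATA = b"\x01\x02\x03\x04"

# A drive with random glitches: each bad read is wrong in a different way.
glitchy_reads = itertools.chain(
    [b"\x01\xff\x03\x04", b"\x01\x02\x00\x04"],
    itertools.repeat(TRUE_DATA),
)

def glitchy_drive():
    return next(glitchy_reads)

# A drive that drops the same byte on every read: wrong, but consistent.
def consistent_drive():
    return b"\x01\x03\x04"

print(verified_read(glitchy_drive) == TRUE_DATA)     # True: only the good data repeats
print(verified_read(consistent_drive) == TRUE_DATA)  # False: the repeatable error is accepted
```

Random glitches never repeat, so comparison weeds them out. An error that comes back identically on every read looks exactly like good data to this kind of check, which is why a second verification stage is needed to catch it at all.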


Do symbols in the progress bar mean a bad rip?

' ', '-', and '+' symbols in the progress bar are harmless; the resulting audio file should have no defects. '!' indicates that cdparanoia is uncertain about having caught all errors; a few are likely harmless, lots indicate a problem. 'V' indicates an error that definitely got through, probably an audible error (unless it happened in silence at the edges due to a known bug).
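
For anyone scanning saved progress bars by script, the rules above can be written down as a small helper. This is a sketch, not something cdparanoia provides: `assess_bargraph` is a hypothetical function, and it takes a string of bargraph characters that you have copied out yourself. The answer above does not rate 'e', 'X' and '*', so the sketch reports them separately as 'corrected', following the legend.

```python
HARMLESS = set(" -+")    # no defects expected in the output file
CORRECTED = set("eX*")   # per the legend: caught and corrected or interpolated
UNCERTAIN = set("!")     # corrected, but similar errors may have slipped by
DEFECT = set("V")        # an error definitely got through

def assess_bargraph(marks):
    """Return the most serious condition found in a string of bargraph marks.

    Characters outside the legend are ignored.
    """
    chars = set(marks)
    if chars & DEFECT:
        return "defect"
    if chars & UNCERTAIN:
        return "uncertain"
    if chars & CORRECTED:
        return "corrected"
    return "clean"

print(assess_bargraph("  --  ++  "))   # clean
print(assess_bargraph("--e--X--"))     # corrected
print(assess_bargraph("--!--"))        # uncertain
print(assess_bargraph("-!--V-"))       # defect
```

A single 'uncertain' result is not proof of a bad rip; as noted above, a few '!' marks are likely harmless and only large numbers of them point to a real problem.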


How can I tell if my drive would be OK with regular cdda2wav?

Easy. Run cdparanoia; if the progress meter never shows any characters but the little arrow going across the screen, the CDROM drive is probably one of the (currently) few drives that can read a pristine stream of data off an audio disc regardless of circumstances. This drive will work quite well with cdda2wav (or cdparanoia using the '-Z' option).

A drive that results in a bargraph of all hyphens would *likely* work OK with cdda2wav, but it's less certain.

Any other characters in the bargraph (colons, semicolons, pluses, Xs, etc.) indicate that a fixup had to be performed at that point during the read; that read would have failed or 'popped' using cdda2wav.


(Linux only) What is the biggest value of SG_BIG_BUFF I can use?

[note: This question only applies to the SG kernel interface; cdparanoia 10 prefers the 2.6 kernel's new SG_IO interface. Only users with older kernels care about SG.]

65536 (64 kilobytes). Some motherboards can use 128kB DMA, but attempting to use 128kB DMA on a machine that can't do it will crash the machine. Cdparanoia will not use larger than 64kB requests.


Why do the binary files from two reads differ when compared?

The usual problem is the beginning point of the read. Cdparanoia enforces consistency from whatever the drive considers to be the starting point of the data, and the drive is returning a slightly different beginning point each time. The beginning point should not vary by much, and if this shift is accounted for when comparing the files, they should indeed turn out to be the same (aside from errors duly reported during the read; scratch correction or any reported skips will very likely also result in different files).

It is also possible to get different reads if the media is damaged and cdparanoia is unable to deterministically repair the error.
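
One way to account for the shifted starting point when comparing two rips is to search for the small offset at which the overlapping data matches. The sketch below works on raw, headerless sample data; `find_shift` and `overlap_equal` are hypothetical helpers and not part of cdparanoia, and the brute-force search is only meant to show the idea.

```python
FRAME_BYTES = 4  # one stereo sample frame: 2 channels x 16 bits

def overlap_equal(a, b, shift):
    """Compare a and b, assuming b starts `shift` bytes into a.

    A negative shift means a starts that many bytes into b instead.
    """
    if shift >= 0:
        a_part, b_part = a[shift:], b
    else:
        a_part, b_part = a, b[-shift:]
    n = min(len(a_part), len(b_part))
    return n > 0 and a_part[:n] == b_part[:n]

def find_shift(a, b, max_frames=64):
    """Return the byte shift that lines up the two rips, or None if none does.

    Shifts are tried in whole sample frames, smallest first.
    """
    for frames in sorted(range(-max_frames, max_frames + 1), key=abs):
        shift = frames * FRAME_BYTES
        if overlap_equal(a, b, shift):
            return shift
    return None

# Synthetic "disc" data; the second read begins three sample frames later.
disc = bytes((i * i) % 251 for i in range(4000))
rip_one = disc[0:3000]
rip_two = disc[12:3012]

print(find_shift(rip_one, rip_two))   # 12
```

If no shift within the search window makes the overlap match, the difference is not just a starting-point offset, and the errors reported during the reads are the next thing to check.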


Why do CDParanoia, CDDA2WAV et al. rip files off into WAV format (and other sample formats) but not CDDA format?

WAV and AIFC are simply convenient formats that include enough header information such that multipurpose audio software can uniquely identify the form of the data in the sample. In raw form, mulaw, SND and CDDA look exactly alike to a program like xplay, and are very likely to blow your ears (and stereo) out when played! Header formats are more versatile and safer. By default, cdparanoia and cdda2wav write WAV files.

That said, cdparanoia (and cdda2wav) will write raw, headerless formats if explicitly told to. Cdparanoia writes headerless, signed 16 bit, 44.1kHz stereo files in little endian format (LSB first) when given the -r option, and the same in big endian (MSB) format when given -R. All files written by cdparanoia are a multiple of 2352 bytes long (minus the header, if any) as required by cd writer software.
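
Because the raw layout is fully specified above, a headerless file can be given a WAV header after the fact. The following is a minimal sketch using Python's standard `wave` module; `raw_to_wav` and `swap_endianness` are hypothetical helper names, and the input is assumed to be data in the -r (little endian) layout.

```python
import io
import wave

SECTOR_BYTES = 2352

def raw_to_wav(raw_pcm, out_file):
    """Wrap headerless little endian CD audio in a WAV header."""
    if len(raw_pcm) % SECTOR_BYTES != 0:
        raise ValueError("length is not a whole number of 2352 byte sectors")
    with wave.open(out_file, "wb") as wav:
        wav.setnchannels(2)       # stereo
        wav.setsampwidth(2)       # 16 bit samples
        wav.setframerate(44100)   # 44.1kHz
        wav.writeframes(raw_pcm)

def swap_endianness(raw_pcm):
    """Swap the two bytes of every 16 bit sample (big endian <-> little endian)."""
    swapped = bytearray(raw_pcm)
    swapped[0::2], swapped[1::2] = raw_pcm[1::2], raw_pcm[0::2]
    return bytes(swapped)

# One sector of digital silence, written to an in-memory file.
one_sector = bytes(SECTOR_BYTES)
buffer = io.BytesIO()
raw_to_wav(one_sector, buffer)

with wave.open(io.BytesIO(buffer.getvalue()), "rb") as wav:
    print(wav.getnframes())   # 588 sample frames in one sector
```

Data in the -R layout would need to go through `swap_endianness` first, since WAV stores samples LSB first. The length check mirrors the multiple-of-2352 property described above.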

Cdparanoia and the Laser-Playback-Head-of-Omniscience logo are trademarks (tm) of Xiph.Org. These pages are copyright (C) 1994-2002 Xiph.Org. All rights reserved.
Comments and questions about this web site are welcome.