The Demery Chronicles (no.1)
Dreams do come true
As a music lover, it had always been my desire to work in the music industry. I did some flirting with the idea of joining a studio while at school and then later at university, but, somehow, I never made the jump, and ended up instead working on the GSM digital cellular telephone system! While I had fun helping to develop dynamic navigation systems for automobiles, it wasn’t exactly rock-n-roll. With Philips Electronics (my employer at the time) deciding to divest itself of much of its in-car electronics and mobile radio activities in the early 1990s, the projects I was working on started to wind down.
By sheer good fortune, one of my former project leaders, Jan Biesterbos, was assigned to the same group as myself within Philips’ main Research Laboratories in Eindhoven, The Netherlands. As an optical disc technologist, Jan had left my navigation project to manage the development of what became the very first DVD player. He had then been asked to manage the development of a recording system for a new, unnamed audio project that Philips and Sony were collaborating on.
I can remember our conversation like it happened yesterday. Jan had stopped to ask me how things were going. After explaining that Philips was divesting itself of several business units, and so things weren’t looking too rosy for my own work, I asked what he was up to. He explained that Philips/Sony were starting the development of a new audio format that would improve on the highly successful Compact Disc (CD), and that the project would draw on specialists in many areas: disc manufacturing, player design, IC design, error correction coding, and lossless audio compression. It sounded wonderful. Then came those magical words: “Isn’t this something you would be interested in doing?”
Jan had remembered how mad I was about music and all things hi-fi. “Uh, yes,” I stuttered, “if you think that there is something for me to do, I’d love to take part.” Jan told me that he would need to discuss my participation with the overall project management, and would get back to me. After a wait that seemed like an eternity, but was probably just a few days, Jan informed me that I could join the project! I was going to enter the world of rock-n-roll after all!!!!
More of everything
Our job was to develop and build a Direct Stream Digital (DSD) recording system, and then convince studios, artists and labels of how wonderful it was. At the same time, all the specifications for what eventually became Super Audio CD (SA-CD) were being hammered out by the many specialists within Philips and Sony. On January 1st, 1996, I joined the project full-time, having worked part-time on it for the preceding few months. My job was to help install a reference listening system within Philips, track the development of the recorder, test it, assist in its use in the field, and promote both it and the format in general to labels, studios, recording/mastering engineers and consumers (though that would come much later).
The controversial part of the SA-CD specifications was the choice of DSD as the audio encoding format. Philips had stirred the pot in digital audio when it released the first Bitstream™ CD players in the mid-1980s. Bitstream™ was based on high-speed sigma-delta modulation; however, the transport medium, CD in this case, used Pulse-Code Modulation (PCM). Sigma-delta modulation was used in analogue-to-digital and digital-to-analogue converters to overcome many practical problems of implementing accurate PCM converters. The logic behind the choice of DSD was quite simple: remove the intermediate translations to PCM, and record and process the high-speed data stream as is. Without the extra conversion steps, surely the audio would sound better?
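For readers who have never seen sigma-delta modulation in action, here is a minimal sketch of the idea in Python. It is a textbook first-order loop, not the Bitstream™ circuit or the DSD encoder used in the project (real converters use higher-order loops and far more careful filtering), and the function names and test signal are my own illustrative assumptions. It shows how a stream of single bits, when averaged, tracks the original signal.

```python
import math


def sigma_delta_1bit(samples):
    """Textbook first-order sigma-delta modulator (illustrative only).

    Takes samples in the range [-1.0, 1.0] and returns a list of +1/-1 values,
    one output bit per input sample.
    """
    integrator = 0.0   # running sum of the error between input and output
    feedback = 0.0     # previous output bit (0.0 before the first sample)
    bits = []
    for x in samples:
        integrator += x - feedback
        feedback = 1.0 if integrator >= 0.0 else -1.0
        bits.append(1 if feedback > 0 else -1)
    return bits


def block_average(bits, block):
    """Crude low-pass filter: average the bitstream over consecutive blocks."""
    return [sum(bits[i:i + block]) / block
            for i in range(0, len(bits) - block + 1, block)]


if __name__ == "__main__":
    # Hypothetical test signal: one cycle of a sine wave, heavily oversampled.
    n = 64 * 256
    signal = [0.5 * math.sin(2 * math.pi * i / n) for i in range(n)]
    stream = sigma_delta_1bit(signal)
    recovered = block_average(stream, 64)
    # The averaged bitstream should roughly follow the shape of the sine.
    print([round(v, 2) for v in recovered[::32]])
```

The point of the sketch is that the bitstream itself already carries the audio: a simple low-pass filter (here, a block average) is enough to get back to something resembling the input, with no intermediate PCM words involved.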
At that time, some people within the project felt that a high-resolution stereo disc based on DSD would be sufficient to engage the interest of music consumers the World over. Improved sound quality and compatibility back to CD via the hybrid disc concept were felt to be sufficient inducements to get buyers interested in the format. Many in the project, however, disagreed. Multi-channel audio was felt to be crucial to the next revolution in home audio in the way that stereo had overtaken mono in the 1960s. Since we were still a long way away from actually launching SA-CD at that point, Philips elected to make its recording system multi-channel ready.
The design, implementation and testing of the planned recorder took well over a year. The chief designer in the Philips team was the legendary Carel Dijkmans, a brilliant engineer, and the holder of over 50 US patents. Carel had been one of the developers of the Bitstream™ concept. His brief was to make the best sounding analogue-to-digital-to-analogue conversion system he could. That meant no switched-mode power supplies, no metal-film resistors (carbon only in the audio signal path), telecommunication-grade optical fibers to minimize interference and to avoid ground loops, multiple digital-to-analogue converters (DACs) in differential mode for highest signal/noise ratios, etc., etc.
While the design and construction were underway, I went in search of people to help evaluate how good (or bad) the system was. At that time, Philips still owned PolyGram, and so getting access to PolyGram’s recording people was relatively easy. In this way, I found myself one day at the Philips Classics Recording Centre (now Polyhymnia) in Baarn, The Netherlands. It was there that I met Erdo Groot for the first time. Erdo is a classical music balance engineer (or tonmeister), and had focused on multi-channel recording while studying at the Department of Music and Sound Recording at Guildford University in England. Like all recording engineers, he had a very strong interest in anything that would improve fidelity, but was almost unique in firmly believing that the addition of multi-channel surround would be as important in adding to the realism of a recorded event – particularly in the case of classical music – as the improvement in recording quality.
Like many of his colleagues, Erdo aims to get the right balance at the actual recording session, so he records his stereo mix directly. However, many factors can militate against using the “live” mix for a CD, so, for safety, he also records his microphone signals to a 24-track recorder to allow any after-the-event fixes to be carried out. As a result, he had a number of multi-track recordings that could be re-mixed to multi-channel to show both the labels and the management in Philips what multi-channel could bring to the table.
Several weeks later, Erdo visited us in Eindhoven with a handful of Tascam DA-88 tapes. While only 16-bit/44.1 kHz PCM, these tapes were a revelation. Erdo had re-mixed the following titles:
Berlioz: Messe Solennelle – John Eliot Gardiner/Orchestre Revolutionnaire et Romantique/Monteverdi Choir
Britten: Symphony for Cello/Walton: Concerto for Cello – Neville Marriner/Academy of St. Martin-in-the-Fields/Julian Lloyd-Webber
Finzi: Clarinet Concerto/Romance/Nocturne/Dies Natalis – Neville Marriner/Academy of St. Martin-in-the-Fields/Ian Bostridge
Verdi: Oberto – Neville Marriner/Academy of St. Martin-in-the-Fields/London Voices
Philips Research was equipped with a dedicated, tuned, floating listening room. We could listen to music all day, at whatever level, without disturbing anyone. The first tape we assessed was the Berlioz. This was a live recording from the hugely reverberant Westminster Cathedral. We immediately found some “problems”. What the hell was that buzzing sound, and why had he dubbed the intro from Pink Floyd’s Welcome To The Machine on the quiet bits? It turned out that this was the premiere performance of a “lost” Berlioz work. Given its significance, the piece was not only being recorded but filmed too. The buzzing came from all the arc lights used to illuminate the cathedral for the benefit of the cameras. As for the machine drone, that was the diesel generators outside that provided the power. There was no way to completely close the door, so there was leakage, picked up on the left rear microphone (fortunately, Erdo was already mic’ing for multi-channel). All these things were present in the stereo mix, but were more obvious in multi-channel.
Moving on to the Verdi, the first thing we were aware of was the heightened theatricality. Locating the voices moving around the (seemingly larger) stage was much more natural in multi-channel, and hearing off-stage elements appear rock-solid in the image from 90 degrees to center caught many people by surprise. Multi-channel also seemed to give the orchestra and singers more room to “breathe” compared to the already excellent stereo version.
The most interesting piece, however, was the Walton Concerto for Cello. Here Erdo had created two mixes. The first could be considered a standard concert presentation, with the cello just left of the conductor’s center position. The second mix represented how the musicians were arranged for the recording. In order to provide himself extra control, Erdo had placed the cello behind the conductor, playing towards the orchestra. In his alternative mix, Erdo had re-created this set-up, and, when you found the sweet-spot, you had the impression of a cellist playing directly behind you, with an orchestra playing in front of you.
This mix polarized people. Many hated it because they felt it was unnatural, and akin to wearing headphones while simultaneously listening to speakers. However, from a musical standpoint, it was extremely revealing, as it enabled the listener to easily follow the solo cello irrespective of how loud the orchestra played. This was our first indication that mixing for multi-channel could be controversial even in the more staid world of classical music!
Everyone who heard the mixes agreed that in terms of timbre, bass control, image depth, image width, etc., the multi-channel was far superior to the stereo (also mixed by Erdo). No offense is intended to the diehard stereo purists who insist that their 2-channel systems rule supreme, but please bear in mind that these comparisons were done on a system using five identical, bi-amplified B&W professional monitors. Under such conditions, there has to be something wrong with a multi-channel mix for it to sound worse than stereo.
With everyone happy with the results, it was decided that Erdo should be given the honor of being the first to use the multi-channel DSD recorder when it was ready to go!
Quit talking, and start recording!
The recorder went through testing, de-bugging, re-testing, code re-jigging, final shakedown, and was given the green light for use. The photograph below shows a part of the system. The large units on the right are the analogue-to-digital converters (ADC) [top] and DACs [bottom]. Not shown are their individual identically-sized power supply units (PSUs). Each ADC (or DAC) unit contained 9 cards: a clock splitter and 8 channels of ADC (or DAC). The DACs had dual outputs to provide two feeds to our bi-amplified monitors. Each PSU contained 9 linear power supplies: one per card. The PSUs were attached using 9 chunky cables that were just long enough to allow the units to be connected together horizontally or vertically. They all gave off considerable amounts of heat, and weighed about 100 pounds each.
Recording/replay was controlled from a standard PC. Back then, we were probably using a blazingly fast 233 MHz CPU!!! All the data was stored on hard-disk initially, and the photo shows two 7-disc SCSI RAID systems. These things were heavy, noisy, and probably contained no more than 20 GB of user storage (if that), and heaven help you if you left home without the appropriate SCSI terminator!! When the disks were full, or at the end of each recording session, data was dumped on to Digital Linear Tape (DLT) [the small unit atop the two RAID arrays]. A special interface unit, not shown, controlled the transfer of data between the PC/hard-disk system and the converters. The converters were connected to the interface box using two optical fiber bundles. The fibers allowed the noisy PC gear to be situated remotely from the ADCs/DACs. When it was all packed up, it required 14 or 15 flight-cases to hold it all.
In May of 1997, we finally found ourselves at the Colosseum in Watford, just north of London, to carry out our (and the World’s) first ever multi-channel DSD recording. The session was the recording of the complete Schumann symphonies by Sir John Eliot Gardiner and the ORR, though, since we were still evaluating everything, we would only stick around for Symphony No. 4 which was to be recorded first. Also, since this was an actual recording for release on DG’s Archiv label, our test was secondary to the CD recording (a situation that often exists to this day). As a result, there was no monitoring of the multi-channel recording. Erdo set up 5 microphones especially for us in the industry-standard ITU multi-channel configuration (0°, ±30° and ±110°), and also fed us his stereo mix from his ultra-high quality custom analogue console. Apart from trusting Erdo’s many years of experience, we would be recording blind!
Unlike current DSD recorders/editors, our first generation software had a few “features” that were less than desirable. Key among them was the fact that the system had a considerable “recovery” time following a record cycle (i.e., go into record, come out of record). Once you had stopped recording, you needed to re-set the software before you could carry on. This process took somewhere between 30 seconds and a minute (though it felt like several lifetimes when you were in a live recording session!). To compound matters, we were working with considerably smaller hard-disks than we have today. As 8 channels of DSD eat approximately 10 GB of disk space per hour, our MASSIVE disk arrays equated to about 2 hours of recording time. So, our natural tendency was to try to conserve disk space as much as possible, with the caveat that every time we stopped recording we would have to wait a while before we could start again.
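The disk-space arithmetic is easy to check with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes the standard SA-CD DSD rate of 64 × 44.1 kHz at 1 bit per sample (the exact rate of our prototype recorder is not stated here), takes the roughly 20 GB of total user storage mentioned earlier as a round figure, and uses function names invented for the illustration.

```python
# Assumption: standard DSD rate of 64 x 44.1 kHz, one bit per sample.
DSD_SAMPLE_RATE_HZ = 64 * 44_100  # 2,822,400 one-bit samples per second
GB = 10**9                        # decimal gigabyte


def dsd_bytes_per_hour(channels, sample_rate_hz=DSD_SAMPLE_RATE_HZ):
    """Raw DSD payload in bytes for one hour of recording."""
    bits_per_second = channels * sample_rate_hz
    return bits_per_second * 3600 // 8


def recording_hours(capacity_bytes, channels):
    """How many hours of raw DSD fit in the given capacity."""
    return capacity_bytes / dsd_bytes_per_hour(channels)


if __name__ == "__main__":
    per_hour = dsd_bytes_per_hour(8)
    print(f"{per_hour / GB:.2f} GB per hour")                   # 10.16 GB per hour
    print(f"{recording_hours(20 * GB, 8):.2f} hours on 20 GB")  # 1.97 hours on 20 GB
```

So eight channels come to a little over 10 GB per hour, and 20 GB buys just under two hours, before any file-system or RAID overhead is taken into account.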
After a few rehearsals, it was time to go for the first take. Everyone in record ready? Yes. Everyone in record? Yes. Slate the first take. The music starts, and within a few seconds we hear the rat-tat-tat of the conductor’s baton on his music stand. Everyone in the orchestra stops, and a big discussion starts. Erdo shouts, “Keep rolling.” More discussion. “Keep rolling.” More discussion. We start to get fidgety. Doesn’t the maestro realize our predicament? We shoot anxious glances at Erdo. “You can probably stop, if you like.” We hit stop. Almost immediately the discussions stop, and it’s time to go again. Except, our computer is still thinking about it. Slowly. V-e-r-y s-l-o-w-l-y. There’s nothing we can do about it. The orchestra starts without us. Bugger!
They don’t get too far, however, before it’s time to talk it over again. Bar numbers are mentioned. Players are asked to play louder, softer, faster, slower. By the time the instructions are given out, we’re ready to go again. Great! This time they get through the entire first movement without stopping. Excellent! We hit stop. They decide to immediately go into another take. Bugger!
This scenario was to play itself out over and over for the three days or so that it took to record the 4th Symphony. Sometimes we got lucky and got the “good” take, sometimes we didn’t. Sometimes we stopped too soon during the “false takes” and didn’t get anything. Sometimes, in our eagerness to conserve disk space, we stopped too soon and cut off the decay of the last note – easy to do when you are working without monitors! It was all a learning experience, and our methods would improve on future trips, but for now we were happy to have recorded anything.
Silence is golden
When the equipment arrived back in Eindhoven, we were all eager to hear the results. Everyone had their own ideas about what aspects of the reproduction would be improved. With everything set up, it was time for the first replay. The results were staggering. The 5-microphone technique Erdo had used gave amazing spatial imaging. One of the aspects of DSD that receives considerable attention now is its extended bandwidth. It seemed natural to assume that DSD would give immediate gains when recording instruments rich in high frequencies, like cymbals. So, why did the bass sound better? And why did there seem to be more “flesh” on individual instruments?
Philips Classics Recording Centre were using dCS converters to generate their PCM data. Unlike many studios, they had enough dCS gear to cover all 24 tracks of their multi-track recorder. For comparative purposes, we also recorded 24-bit/44.1 kHz dCS versions of the stereo and multi-channel mixes to act as our CD reference.
The dCS PCM converters were, and still are, considered among the very best in the World, yet switching between the “SA-CD” and “CD” versions showed the clear improvements that DSD could bring to recording. The rasp on strings was better. The ORR uses period instruments and the thwacks of the small timpani sounded like wood hitting skin. There was an ease to the sound that was missing in the “CD” version. The “CD” image seemed to be two-dimensional in comparison (even when listened to in multi-channel). The 4th Movement begins with a quiet string section that is joined by the horns as the level begins to build. It was this 90-second section that demonstrated one of the biggest benefits of high-resolution recording: the sound of horns. If there is one thing that CD cannot do, it is reproduce a horn playing loudly. As the sound begins to expand from the bell, something seems to go wrong and it sounds like the notes are being pinched or squeezed. The DSD version allowed the notes to build naturally. We were all dumbfounded.
Still, the most telling part of these test recordings was listening to the silences, the pauses between recordings. Anyone who has stood in an empty concert hall has heard the “room-tone”, the sound of an empty room. Listening to this in multi-channel DSD was a revelation. For me, it was the thing that really made you feel like you were back in the Colosseum. Bizarrely, even the room-tone sounded different between the “SA-CD” and “CD” versions!
So, our little test had shown that there were improvements to be had in going from 2-channel CD to 2-channel SA-CD to multi-channel SA-CD. We were on to something! Our first recording was in the bag. It wasn’t exactly rock-n-roll, but I didn’t care. I was hooked!
Next: It’s time to go public