[VB.NET] .wav File Format

thegamefreak0134
Extreme Poster
Posts: 455
Joined: Mon 23 Jan, 2006 10:09 pm
Location: In front of a Computer, coding

Post by thegamefreak0134 »

OK, I'll come out and say it. I want to write a software mixer, with the ability to play with the sounds in many different ways. This is for many reasons, but mainly to try to create a similar thing on both the GBA and maybe the calc. Who knows...

Anywho, I am using my usual VB.NET. Is there an easy way to read the individual bytes of a wav file without swimming through the file byte by byte myself? I know that aside from sample frequency, each sample is represented by either one or two bytes, depending on 8-bit or 16-bit respectively. Due to the nature of a sound wave, it seems these bytes must be signed.

Basically, I need to read one byte of the wave at a time and do something nice and fun with it. (I want to create an algorithm that can actually stretch a wave out properly. I know how to do it, it's just a matter of achieving it.)

Does VB even allow it, or do I have to go swimming into the depths? Thanks for any help.

-gamefreak
I'm not mad, just a little crazy.

DarkNova - a little side project I run.
King Harold
Calc King
Posts: 1513
Joined: Sat 05 Aug, 2006 7:22 am

Post by King Harold »

There HAS to be a way, but if I were you I would use C++ or C#. I THINK they make byte-level operations easier, but I might as well be wrong.
I think I remember something about offsets and it sounds like that's what you want, is that right?
coelurus
Calc Wizard
Posts: 585
Joined: Sun 19 Dec, 2004 9:02 pm
Location: Sweden

Post by coelurus »

I'm not a VB enthusiast, Ben or kv83 probably knows more about this. Wouldn't say you should switch (although I would :wink:).

Anyway, you can either read sample per sample (which you try to avoid it seems, look up BinaryReader or similar on MSDN) or dump the whole pcm-thingie (the whole sample-kablooey, BinaryReader again!) into a buffer and use a proper type (8/16) to access the elements.

Or you could look for a lib to load the wav for you.
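For what it's worth, the "dump it into a buffer" route might look roughly like the sketch below. It assumes a well-formed, uncompressed PCM file on a seekable stream, and the names (ReadDataChunk, ToSamples16, ToSamples8) are made up for illustration. One thing worth knowing: 16-bit WAV samples are signed, but 8-bit WAV samples are stored unsigned with silence at 128, so they need shifting before they behave like a signed wave.

```vbnet
Imports System
Imports System.IO
Imports System.Text

Module WavReaderSketch

    ' Returns the raw bytes of the "data" chunk of a RIFF/WAVE stream.
    ' Assumes a seekable stream and a well-formed file.
    Function ReadDataChunk(source As Stream) As Byte()
        Dim reader As New BinaryReader(source)

        ' 12-byte header: "RIFF", overall size, "WAVE".
        Dim riff As String = Encoding.ASCII.GetString(reader.ReadBytes(4))
        reader.ReadUInt32()
        Dim wave As String = Encoding.ASCII.GetString(reader.ReadBytes(4))
        If riff <> "RIFF" OrElse wave <> "WAVE" Then
            Throw New InvalidDataException("Not a RIFF/WAVE file.")
        End If

        ' Walk the chunks until "data" turns up.
        While source.Position + 8 <= source.Length
            Dim chunkId As String = Encoding.ASCII.GetString(reader.ReadBytes(4))
            Dim chunkSize As Integer = CInt(reader.ReadUInt32())
            If chunkId = "data" Then
                Return reader.ReadBytes(chunkSize)
            End If
            ' Chunks are padded to an even number of bytes.
            source.Seek(chunkSize + (chunkSize And 1), SeekOrigin.Current)
        End While

        Throw New InvalidDataException("No data chunk found.")
    End Function

    ' Reinterprets 16-bit PCM bytes as signed samples.
    ' Relies on the machine being little-endian, like the file format.
    Function ToSamples16(pcm As Byte()) As Short()
        Dim samples(pcm.Length \ 2 - 1) As Short
        Buffer.BlockCopy(pcm, 0, samples, 0, samples.Length * 2)
        Return samples
    End Function

    ' 8-bit PCM is unsigned (0..255); subtract 128 to centre it on zero.
    Function ToSamples8(pcm As Byte()) As Short()
        Dim samples(pcm.Length - 1) As Short
        For i As Integer = 0 To pcm.Length - 1
            samples(i) = CShort(CInt(pcm(i)) - 128)
        Next
        Return samples
    End Function

End Module
```

Something like ToSamples16(ReadDataChunk(File.OpenRead("sample.wav"))) would then give one array of samples to play with, still interleaved by channel if the file is stereo.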
benryves
Maxcoderz Staff
Posts: 3087
Joined: Thu 16 Dec, 2004 10:06 pm
Location: Croydon, England

Post by benryves »

First things first - writing a WAV file decoder isn't easy, unless you fancy hooking into something like DirectShow to handle it for you. WAV files can use different compression schemes for the audio data - a WAV file could store the sound using MP3, for example. I assume you don't want to also have to write your own codec for every audio compression scheme under the sun, so manually loading the WAV isn't a great idea.

That said, wotsit.org is a great site for looking up file formats. As coelurus said, the BinaryReader class is the best way to load binary data. C++ or C# don't make byte-level operations any easier ;)

I think what you should look for is a sound library that lets you perform low-level operations. I really like FMOD Ex (free for non-commercial use). All you'd need to do is set up a delegate (callback) that will be called whenever you need to fill the sound buffer. It passes a pointer to the sound buffer and the amount of data you need to provide, and you just fill it with bytes. Couldn't be easier.

There is, however, one big problem. VB.NET has one annoying limitation - no unsafe pointer support. To fill the (unmanaged) sound buffer you need to use an unsafe pointer. You will need to use C# for that.

You could write a simple library that handles the basics for you in C#, compile to a class library and use that from VB.NET.
benryves
Maxcoderz Staff
Posts: 3087
Joined: Thu 16 Dec, 2004 10:06 pm
Location: Croydon, England

Post by benryves »

Turns out I'm full of rubbish (and haven't used FMOD Ex recently). It doesn't pass an unmanaged pointer, it passes an IntPtr. You can just fill an array up with the data you need then use Marshal.Copy to dump the data across into the sound buffer (it works, I just went and removed all the unsafe code blocks from an old FMOD Ex project and it's still cheerful).
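As a rough illustration of the Marshal.Copy approach (not the actual FMOD Ex callback), here is a sketch that fills an array with a sine wave and copies it to an IntPtr. The FillBuffer routine, its parameters and the use of Marshal.AllocHGlobal as a stand-in for the library's sound buffer are all assumptions made so the example runs on its own.

```vbnet
Imports System
Imports System.Runtime.InteropServices

Module MarshalCopySketch

    ' Fills unmanaged memory with a 16-bit sine wave, the way a PCM read
    ' callback might. "target" stands in for the IntPtr handed over by
    ' the sound library; byteCount is how much data it asked for.
    Sub FillBuffer(target As IntPtr, byteCount As Integer, frequency As Double, sampleRate As Integer)
        Dim samples(byteCount \ 2 - 1) As Short
        For i As Integer = 0 To samples.Length - 1
            samples(i) = CShort(Math.Sin(2 * Math.PI * frequency * i / sampleRate) * 16000)
        Next
        ' Copy the managed array into unmanaged memory in one go.
        Marshal.Copy(samples, 0, target, samples.Length)
    End Sub

    Sub Main()
        ' Stand-in for the library's buffer: 1 KiB of unmanaged memory.
        Dim byteCount As Integer = 1024
        Dim unmanaged As IntPtr = Marshal.AllocHGlobal(byteCount)
        Try
            FillBuffer(unmanaged, byteCount, 440.0, 44100)

            ' Read a few samples back to show that the copy happened.
            Dim check(3) As Short
            Marshal.Copy(unmanaged, check, 0, check.Length)
            For Each value As Short In check
                Console.WriteLine(value)
            Next
        Finally
            Marshal.FreeHGlobal(unmanaged)
        End Try
    End Sub

End Module
```

No unsafe code or pointers are needed on the VB.NET side; the IntPtr is only ever passed along to Marshal.Copy.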

There's still the problem that the FMOD-provided wrapper is just a bunch of C# source files. I'll have a go at compiling these into a class library for you.
benryves
Maxcoderz Staff
Posts: 3087
Joined: Thu 16 Dec, 2004 10:06 pm
Location: Croydon, England

Post by benryves »

Three in a row - oh dear, but here we go. :)

Right, so Microsoft have made it really difficult to work with ;) (no, not really, it's a doddle).

First of all, you need to communicate with the FMOD Ex library. Typically this is done via the unmanaged fmodex.dll. You need to use P/Invoke ("Platform Invoke") to talk to this DLL from managed (.NET) code. You can either do this yourself (writing all the definitions out) for each function or use their provided C# source files (fmod.cs, fmod_dsp.cs and fmod_errors.cs).
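To give a feel for what "writing the definitions out" means, here is a minimal P/Invoke sketch in VB.NET. It deliberately does not declare anything from fmodex.dll (I don't want to guess at those signatures); it uses GetTickCount from the Windows kernel32.dll purely as a stand-in, so it is Windows-only.

```vbnet
Imports System
Imports System.Runtime.InteropServices

Module PInvokeSketch

    ' Declares the unmanaged GetTickCount function exported by kernel32.dll.
    ' The body stays empty; the runtime forwards the call to the DLL.
    <DllImport("kernel32.dll")> _
    Public Function GetTickCount() As UInteger
    End Function

    Sub Main()
        ' Calling it looks like calling any other .NET function.
        Console.WriteLine("Milliseconds since Windows started: {0}", GetTickCount())
    End Sub

End Module
```

A wrapper for a real library is essentially a long list of declarations like this one, which is why using the provided C# source files saves a lot of typing.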

You're not using C#. This isn't a problem. .NET is language agnostic, after all! However, you need to compile that C# code somehow, so do this: create a new VC# project. Set it to build as a class library. Now, remove the source file it added by default (Program.cs) and add the FMOD Ex C# wrapper source files listed above. Build the project (make sure to build in Release mode). It'll spit out a DLL - this is the class library. You can close VC# now, no more need to use it. :)

Go back into VB.NET, and go Project->Add Reference. Browse to the DLL you just created with VC#. Hit OK, and that's it - you can now use FMOD Ex from VB.NET (or, indeed, any other .NET-friendly language). Wonderful stuff.

I have created two projects - a VC# one for the class library and a VB.NET one for a primitive demo. If you're feeling super-lazy, just go into the bin\Release folder of the class library project and grab "fmodex_classlib.dll".

Unfortunately, the authors of FMOD Ex decided to be rebels and not use standard XML documentation comments. This means you won't get help in Intellisense, you'll only get the function names and argument types. The FMOD Ex documentation away from the source is good, though, so look in there.

Anyhow, the VB.NET project is fairly simple. It just sets up FMOD Ex, creates a 'sound' and plays it on a channel. The sound buffer is filled manually thanks to a delegate (callback) "PcmReadCallbackFunction". It uses Marshal.Copy (which you should be great friends with by now!) to dump the data from an array into the sound buffer.

Here is the source code for the demo application.

Grab the projects from here. Note that to get them to actually run you'll need to stick fmodex.dll (you can find it once you've installed FMOD Ex itself) in the working directory. You'll also need to put fmodex_classlib.dll alongside it if you wish to run the VB demo.
thegamefreak0134
Extreme Poster
Posts: 455
Joined: Mon 23 Jan, 2006 10:09 pm
Location: In front of a Computer, coding

Post by thegamefreak0134 »

OK, playing the actual wav isn't the problem. You're right, I can use DirectShow for something like that. I actually need the wav data (raw data) in byte form so I can "play" with it. This doesn't need to be a commercial style operation, I'm OK with having to record the sound in a certain compression style, that's fine and dandy. This is for my own purposes.

I have the wav file specifications, and it's basically a matter of skip this part (read these bits for info) and here is your data. If this is true, I only have one question. Is PCM compressed at all? Or is it raw data? Raw data is what I really need to work with for my routine to work, I think.

-gamefreak
I'm not mad, just a little crazy.

DarkNova - a little side project I run.
benryves
Maxcoderz Staff
Posts: 3087
Joined: Thu 16 Dec, 2004 10:06 pm
Location: Croydon, England

Post by benryves »

Plain PCM data isn't compressed; it's just the raw samples. However, the audio in a WAV file can also be stored using a variety of different compression schemes - see this list.
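If you do read the file by hand, the format tag at the start of the "fmt " chunk tells you which scheme is in use: a value of 1 means plain, uncompressed PCM. A rough sketch of checking it follows; the WavFormat structure and the function names are made up for illustration, and it assumes a well-formed file on a seekable stream.

```vbnet
Imports System
Imports System.IO
Imports System.Text

Module WavFormatSketch

    ' The handful of "fmt " chunk fields that matter here.
    Structure WavFormat
        Public FormatTag As UShort      ' 1 = uncompressed PCM
        Public Channels As UShort
        Public SampleRate As UInteger
        Public BitsPerSample As UShort
    End Structure

    ' Finds the "fmt " chunk and reads its basic fields.
    Function ReadFormat(source As Stream) As WavFormat
        Dim reader As New BinaryReader(source)

        ' Skip "RIFF", the overall size and "WAVE" (12 bytes).
        reader.ReadBytes(12)

        While source.Position + 8 <= source.Length
            Dim chunkId As String = Encoding.ASCII.GetString(reader.ReadBytes(4))
            Dim chunkSize As Integer = CInt(reader.ReadUInt32())
            If chunkId = "fmt " Then
                Dim fmt As WavFormat
                fmt.FormatTag = reader.ReadUInt16()
                fmt.Channels = reader.ReadUInt16()
                fmt.SampleRate = reader.ReadUInt32()
                reader.ReadUInt32() ' average bytes per second (unused here)
                reader.ReadUInt16() ' block align (unused here)
                fmt.BitsPerSample = reader.ReadUInt16()
                Return fmt
            End If
            ' Skip chunks we don't care about (padded to an even size).
            source.Seek(chunkSize + (chunkSize And 1), SeekOrigin.Current)
        End While

        Throw New InvalidDataException("No fmt chunk found.")
    End Function

    ' True when the samples are stored as raw, uncompressed PCM.
    Function IsPlainPcm(fmt As WavFormat) As Boolean
        Return fmt.FormatTag = 1
    End Function

End Module
```

Anything other than 1 means the data chunk needs a decoder before the bytes make sense as samples, which is where a library earns its keep.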

FMOD Ex can be used to load and decode a WAV file for you (amongst many other formats). I haven't worked with extracting the raw samples, chances are you might be able to do that once you have created a Sound from a file -
WAV - (Microsoft Wave files, inlcluding [sic] compressed wavs. PCM, MP3 and IMA ADPCM compressed wav files are supported across all platforms in FMOD Ex, and other compression formats are supported via windows codecs on that platform).
I'll have a look for you, if you don't end up looking yourself. Seriously, look into FMOD Ex for sound work. It's ace. ;) (and no, I don't work for them).
benryves
Maxcoderz Staff
Posts: 3087
Joined: Thu 16 Dec, 2004 10:06 pm
Location: Croydon, England

Post by benryves »

Sample project that reads a WAV file and dumps out a raw binary file (containing just the PCM samples) for each channel.

The main code to do this (ignoring setting up FMOD) is just:

' Try and load a WAV file:
Result = FmodSystem.createSound("sample.wav", FMOD.MODE.SOFTWARE Or FMOD.MODE.CREATESTREAM Or FMOD.MODE.OPENONLY Or FMOD.MODE.ACCURATETIME, LoadedSound)

' Note that we create it as a STREAM so that FMOD Ex doesn't close it. :)
' Also, note that ACCURATETIME is EXTREMELY important, as some files (eg MP3/VBR) without exact timing information are guessed otherwise.
' This guess is often a little wrong, so ACCURATETIME always calculates the EXACT length of the file (slower to load).

If Result <> FMOD.RESULT.OK Then
    Console.WriteLine("Couldn't load sound: {0}", Result)
    GoTo ShutDownFmod
End If

' Work out the sound format:

' These variables hold the sound info
Dim SoundType As New FMOD.SOUND_TYPE
Dim SoundFormat As New FMOD.SOUND_FORMAT
Dim SoundChannels As Integer
Dim SoundBits As Integer

Result = LoadedSound.getFormat(SoundType, SoundFormat, SoundChannels, SoundBits)
If Result <> FMOD.RESULT.OK Then
    Console.WriteLine("Couldn't get sound format: {0}", Result)
    GoTo CloseSound
End If

Console.WriteLine("Type:		{0}" & vbCrLf & "Format:		{1}" & vbCrLf & "Channels:	{2}" & vbCrLf & "Bits/sample:	{3}", SoundType, SoundFormat, SoundChannels, SoundBits)

' Let's assume you ONLY allow 16-bit samples (silly restriction, but let's assume it's there).

If SoundBits <> 16 Then
    Console.WriteLine("16-bit samples supported only.")
    GoTo CloseSound
End If

' We need to know how long the sound file is:

Dim SoundLength As UInteger
Result = LoadedSound.getLength(SoundLength, FMOD.TIMEUNIT.PCM)
If Result <> FMOD.RESULT.OK Then
    Console.WriteLine("Couldn't get sound length: {0}", Result)
    GoTo CloseSound
End If

Dim SoundByteLength As UInteger = CUInt(SoundLength * 2 * SoundChannels)

' How many samples are there, total?
Dim TotalSamples As Integer = CInt(SoundLength * SoundChannels)

' Let's copy ALL of the samples into an array

' Seek to the beginning of the sound:
Result = LoadedSound.seekData(0)
If Result <> FMOD.RESULT.OK Then
    Console.WriteLine("Couldn't seek to the start of the sound: {0}", Result)
    GoTo CloseSound
End If

' An array to hold all those blasted samples
Dim AllSamples(TotalSamples - 1) As Short

' How many bytes were actually read:
Dim ReadBytes As UInteger

' Read!
Result = LoadedSound.readData(Marshal.UnsafeAddrOfPinnedArrayElement(AllSamples, 0), SoundByteLength, ReadBytes)
If Result <> FMOD.RESULT.OK Then
    Console.WriteLine("Couldn't read the sound: {0} ({1}/{2} read)", Result, ReadBytes, SoundByteLength)
    GoTo CloseSound
End If


' Now, let's "deinterleave" those sound samples.
Dim PlainSamples(SoundChannels - 1, CInt(SoundLength - 1)) As Short

Dim Ptr As Integer = 0
For Sample As Integer = 0 To CInt(SoundLength - 1)
    For Channel As Integer = 0 To SoundChannels - 1
        PlainSamples(Channel, Sample) = AllSamples(Ptr)
        Ptr += 1
    Next
Next
The only really new ".NET thing" in there is Marshal.UnsafeAddrOfPinnedArrayElement, which should be fairly self-explanatory (returns the unsafe address of a pinned array's element).

All it needs to do is load the sound into a Sound object, work out the format (how many channels, and we also check to make sure it's 16-bit) and then the size so we know how many samples we need to read. It's then a case of calling Sound.readData(...) to dump the samples into an array, which are then deinterleaved by channel.
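The snippet above stops once the samples are deinterleaved. Writing each channel out as a raw file could be sketched like this, independent of FMOD; the helper names, the jagged-array layout and the "channelN.raw" file names are illustrative assumptions rather than what the uploaded project does.

```vbnet
Imports System
Imports System.IO

Module ChannelDumpSketch

    ' Splits interleaved samples (L R L R ...) into one array per channel.
    Function Deinterleave(interleaved As Short(), channels As Integer) As Short()()
        Dim frames As Integer = interleaved.Length \ channels
        Dim result(channels - 1)() As Short
        For c As Integer = 0 To channels - 1
            result(c) = New Short(frames - 1) {}
            For f As Integer = 0 To frames - 1
                result(c)(f) = interleaved(f * channels + c)
            Next
        Next
        Return result
    End Function

    ' Writes one channel's samples as raw little-endian 16-bit PCM.
    Sub WriteRaw(output As Stream, samples As Short())
        Dim writer As New BinaryWriter(output)
        For Each sample As Short In samples
            writer.Write(sample)
        Next
        writer.Flush()
    End Sub

    Sub Main()
        ' Three stereo frames of dummy data.
        Dim stereo As Short() = {1, -1, 2, -2, 3, -3}
        Dim perChannel As Short()() = Deinterleave(stereo, 2)

        For c As Integer = 0 To perChannel.Length - 1
            Using target As New FileStream("channel" & c.ToString() & ".raw", FileMode.Create)
                WriteRaw(target, perChannel(c))
            End Using
        Next
    End Sub

End Module
```

The resulting files have no header at all, so whatever reads them back needs to be told the sample rate and bit depth separately.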

EDIT: I keep referring to it as interlaced, of course it's not, it's interleaved. You know what I mean.

EDIT2: Fix...

Note that the uploaded code has one potential pitfall - the Marshal.UnsafeAddrOfPinnedArrayElement call.

The array that I pass to it has not actually been pinned. This means that if the garbage collector kicks in while that function is being called, it may move the array around in memory, and the data would then be written to the wrong place, corrupting memory.

So, how do you 'pin' an object? You can call GCHandle.Alloc with GCHandleType.Pinned, which returns a GCHandle that keeps the object in place until you free it. The code

Result = LoadedSound.readData(Marshal.UnsafeAddrOfPinnedArrayElement(AllSamples, 0), SoundByteLength, ReadBytes)
would be more safely implemented as this:

' Allocate a new pinned GCHandle based on the array we wish to pin.
Dim PinnedDestinationArray As GCHandle = GCHandle.Alloc(AllSamples, GCHandleType.Pinned)
Try
    ' Use the address of this pinned array to read data into.
    Result = LoadedSound.readData(PinnedDestinationArray.AddrOfPinnedObject(), SoundByteLength, ReadBytes)
Finally
    ' Always free the handle, even if readData throws, so the GC can move the array again.
    PinnedDestinationArray.Free()
End Try