Discussion: smix plugin available?
Mark Swanson
2002-11-26 01:16:47 UTC
Hello,

A gentleman by the name of Abramo Bagnara recently stated he may have some
code that would kick-start the development of the smix plugin. I think this
is a very useful and increasingly important component. Abramo stated he had
no problem releasing it, but I could not find where or if he did so.

If Abramo - or anyone - could point me in the right direction I would
appreciate it.

The reason I'm looking for this is to solve the following problems with
soundcards that do not have a hardware mixer:

When using Gnome/KDE (esd/arts) no programs can access the sound device unless
they are written to use arts/esd/jack/etc...

It is not logical for every program to write support for esd, artsd, jack,
alsa, etc. Programs should write to ALSA and let ALSA do software mixing if
required. Windows provided this since DirectX (3?). Solaris provides this too
(esd apparently doesn't block on Solaris).

I believe that the majority of sound cards in use (on Linux) do not have
hardware mixing, and that this is already a large problem that will continue
to get larger.

One of the big reasons this is affecting me is that Java sound will not work
unless you have a hardware mixer. My understanding is that the Sun folks seem
to think that it is wrong to have to implement many different ways to create
sound when the sound library (ALSA) should do it for them - the way it works
in Windows/Solaris. I completely agree with them.

Cheers.
--
Schedule your world with ScheduleWorld.com
http://www.ScheduleWorld.com/



Paul Davis
2002-11-26 03:19:05 UTC
Post by Mark Swanson
It is not logical for every program to write support for esd, artsd, jack,
alsa, etc. Programs should write to ALSA and let ALSA do software mixing if
required. Windows provided this since DirectX (3?). Solaris provides this too
(esd apparently doesn't block on Solaris).
Windows provides this, true. But most "prosumer" and professional
applications do *not* use it. The Windows kernel mixer was the curse
of pro-ish apps for a long time, hence driver models like ASIO and now
WDM. AFAIK, the new WDM drivers are generally used by apps in ways
that preclude "sharing" with other apps, and certainly ASIO drivers
cannot be shared in this way.
Post by Mark Swanson
One of the big reasons this is affecting me is that Java sound will not work
unless you have a hardware mixer. My understanding is that the Sun folks seem
to think that it is wrong to have to implement many different ways to create
sound when the sound library (ALSA) should do it for them - the way it works
in Windows/Solaris. I completely agree with them.
ALSA is *a* sound library. There are lots of things that it doesn't
contain, and its written around a fairly specific programming
paradigm. There are those of us (many people on LAD) who believe that
its too hard to fit a callback-driven model into the existing ALSA
design, and that its therefore better to implement such a model
outside of ALSA.

You see, if all apps are written to use the ALSA API, that's going to
be great for the purposes you have in mind, but totally awful for
those of us who want our audio apps to work in a sample synchronous
way and ignorant of the ultimate routing of their data. Many of us
don't think that an API based on the open/close/read/write paradigm is
appropriate for real time streaming media.

All that being said, I'd love to see the smix plugin implemented and
available, if only because it would allow ALSA native apps to
participate in a JACK system, albeit without sample synchronous
behaviour.

--p




Mark Swanson
2002-11-26 03:54:46 UTC
Post by Paul Davis
Post by Mark Swanson
It is not logical for every program to write support for esd, artsd, jack,
alsa, etc. Programs should write to ALSA and let ALSA do software mixing
if required. Windows provided this since DirectX (3?). Solaris provides
this too (esd apparently doesn't block on Solaris).
Windows provides this, true. But most "prosumer" and professional
applications do *not* use it. The Windows kernel mixer was the curse
of pro-ish apps for a long time, hence driver models like ASIO and now
WDM. AFAIK, the new WDM drivers are generally used by apps in ways
that preclude "sharing" with other apps, and certainly ASIO drivers
cannot be shared in this way.
For "prosumer" and professional applications, lock it and block it. I have no
idea how bad it is for miscellaneous simultaneous use of the sound device wrt
interfering with these types of apps. But, if it is a problem - and I believe
you that it is - then lock away. I don't think anyone would mind that. I
don't think it interferes with what people would use smix for.
Post by Paul Davis
Post by Mark Swanson
One of the big reasons this is affecting me is that Java sound will not
work unless you have a hardware mixer. My understanding is that the Sun
folks seem to think that it is wrong to have to implement many different
ways to create sound when the sound library (ALSA) should do it for them
- the way it works in Windows/Solaris. I completely agree with them.
ALSA is *a* sound library. There are lots of things that it doesn't
contain, and its written around a fairly specific programming
paradigm. There are those of us (many people on LAD) who believe that
its too hard to fit a callback-driven model into the existing ALSA
design, and that its therefore better to implement such a model
outside of ALSA.
You see, if all apps are written to use the ALSA API, that's going to
be great for the purposes you have in mind, but totally awful for
those of us who want our audio apps to work in a sample synchronous
way and ignorant of the ultimate routing of their data. Many of us
don't think that an API based on the open/close/read/write paradigm is
appropriate for real time streaming media.
If there was a way to temporarily disable the smix plugin, or temporarily gain
exclusive ownership of the sound device for your purposes would that meet
100% of your requirements?
Post by Paul Davis
All that being said, I'd love to see the smix plugin implemented and
available, if only because it would allow ALSA native apps to
participate in a JACK system, albeit without sample synchronous
behaviour.
Great.
Know where to find it? :-)
Even totally broken code that does not compile or do anything useful would be
a wonderful head-start relative to starting from scratch.

Cheers.
--
Schedule your world with ScheduleWorld.com
http://www.ScheduleWorld.com/



Paul Davis
2002-11-26 04:13:35 UTC
Post by Mark Swanson
Post by Paul Davis
You see, if all apps are written to use the ALSA API, that's going to
be great for the purposes you have in mind, but totally awful for
those of us who want our audio apps to work in a sample synchronous
way and ignorant of the ultimate routing of their data. Many of us
don't think that an API based on the open/close/read/write paradigm is
appropriate for real time streaming media.
If there was a way to temporarily disable the smix plugin, or temporarily gain
exclusive ownership of the sound device for your purposes would that meet
100% of your requirements?
no, it wouldn't meet any of them. the problem is not exclusive
access. its the fundamental API model. ALSA (like OSS before it, as
well as SGI's DMedia APIs) has promoted the open/close/read/write
model. this is the central problem. ALSA certainly *allows* for a
callback model (its what allows JACK to work), but there are almost no
applications that use ALSA in this way. using the o/c/r/w paradigm
makes sample synchronous execution of multiple applications basically
impossible, and more importantly it encourages application designers
to construct programs based on the idea that the program controls when
to read/write audio data. this doesn't work properly except for
heavily buffered, single applications. the APIs that are used to write
almost all audio software code in production these days all use a
callback model. porting from the o/c/r/w model to the callback one is
hard. do you want another generation of apps stuck with this problem?

if you want a genuinely portable solution, use PortAudio. it works
with (but hides) OSS, Windows MME, ASIO, CoreAudio, and several
others. JACK and ALSA support is present in CVS. it encourages a
callback model.
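
For illustration, here is a minimal sketch of the callback model as PortAudio
exposes it (this uses the later V19-style API - Pa_OpenDefaultStream() and
friends - and the tone generator, sample rate and buffer choices are only
examples, not anything from this thread):

/* Minimal PortAudio callback sketch: a 440 Hz tone generator. */
#include <math.h>
#include <portaudio.h>

#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

typedef struct { double phase; } ToneState;

/* PortAudio calls this whenever the device needs another buffer.
 * timeInfo->outputBufferDacTime says when these samples will hit the DAC,
 * which is what an A/V-synced player would use. */
static int tone_cb(const void *in, void *out, unsigned long frames,
                   const PaStreamCallbackTimeInfo *timeInfo,
                   PaStreamCallbackFlags statusFlags, void *userData)
{
    ToneState *s = (ToneState *)userData;
    float *buf = (float *)out;
    (void)in; (void)timeInfo; (void)statusFlags;
    for (unsigned long i = 0; i < frames; i++) {
        buf[i] = (float)sin(s->phase);            /* mono output */
        s->phase += 2.0 * M_PI * 440.0 / 44100.0;
    }
    return paContinue;
}

int main(void)
{
    ToneState state = { 0.0 };
    PaStream *stream;

    if (Pa_Initialize() != paNoError) return 1;
    /* 0 input channels, 1 output channel, float samples, 44.1 kHz,
     * let PortAudio pick the buffer size. */
    Pa_OpenDefaultStream(&stream, 0, 1, paFloat32, 44100,
                         paFramesPerBufferUnspecified, tone_cb, &state);
    Pa_StartStream(stream);
    Pa_Sleep(2000);                               /* play for two seconds */
    Pa_StopStream(stream);
    Pa_CloseStream(stream);
    Pa_Terminate();
    return 0;
}

The point is that the library calls tone_cb() whenever the device needs data;
the application never decides when to write.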
Post by Mark Swanson
Post by Paul Davis
All that being said, I'd love to see the smix plugin implemented and
available, if only because it would allow ALSA native apps to
participate in a JACK system, albeit without sample synchronous
behaviour.
Great.
Know where to find it? :-)
i think abramo posted a copy of what he had to this list within the
last 5 weeks. unfortunately, sf.net's archives don't allow searching,
so finding it might be tedious, to put it mildly.

--p


Frans Ketelaars
2002-11-26 08:13:59 UTC
On Mon, 25 Nov 2002 23:13:35 -0500
Paul Davis <***@linuxaudiosystems.com> wrote:

<snip>
Post by Paul Davis
Post by Mark Swanson
Post by Paul Davis
All that being said, I'd love to see the smix plugin implemented and
available, if only because it would allow ALSA native apps to
participate in a JACK system, albeit without sample synchronous
behaviour.
Great.
Know where to find it? :-)
i think abramo posted a copy of what he had to this list within the
last 5 weeks. unfortunately, sf.net's archives don't allow searching,
so finding it might be tedious, to put it mildly.
--p
http://www.mail-archive.com/alsa-***@lists.sourceforge.net/msg04592.html

HTH,

-Frans


Mark Swanson
2002-11-26 12:33:09 UTC
Thanks!
I had seen this posting but the particular web interface I was using didn't
show the attachment. Got it.
--
Schedule your world with ScheduleWorld.com
http://www.ScheduleWorld.com/



Florian Bomers
2002-11-26 19:32:40 UTC
Hi Paul,

very interesting discussion.
Post by Paul Davis
(...)
Post by Mark Swanson
One of the big reasons this is affecting me is that Java sound will not work
unless you have a hardware mixer. My understanding is that the Sun folks seem
to think that it is wrong to have to implement many different ways to create
sound when the sound library (ALSA) should do it for them - the way it works
in Windows/Solaris. I completely agree with them.
ALSA is *a* sound library. There are lots of things that it doesn't
I would really say: ALSA is *the* sound library (at least on Linux). Isn't it in kernel
2.5+ ?
Post by Paul Davis
contain, and its written around a fairly specific programming
paradigm. There are those of us (many people on LAD) who believe that
its too hard to fit a callback-driven model into the existing ALSA
design, and that its therefore better to implement such a model
outside of ALSA.
You see, if all apps are written to use the ALSA API, that's going to
be great for the purposes you have in mind, but totally awful for
those of us who want our audio apps to work in a sample synchronous
way and ignorant of the ultimate routing of their data. Many of us
don't think that an API based on the open/close/read/write paradigm is
appropriate for real time streaming media.
I don't know if I get the point right here, but it reads: don't use ALSA. But it's the
only sound library that will be delivered with all distributions in future, so what choice
do mainstream application writers have ? OSS ?
Post by Paul Davis
(...)
Post by Mark Swanson
If there was a way to temporarily disable the smix plugin, or temporarily gain
exclusive ownership of the sound device for your purposes would that meet
100% of your requirements?
no, it wouldn't meet any of them. the problem is not exclusive
access. its the fundamental API model. ALSA (like OSS before it, as
well as SGI's DMedia APIs) has promoted the open/close/read/write
model. this is the central problem. ALSA certainly *allows* for a
callback model (its what allows JACK to work), but there are almost no
applications that use ALSA in this way. using the o/c/r/w paradigm
makes sample synchronous execution of multiple applications basically
impossible, and more importantly it encourages application designers
to construct programs based on the idea that the program controls when
to read/write audio data. this doesn't work properly except for
heavily buffered, single applications.
So you discourage use of ALSA because it suggests/allows a non-professional programming
paradigm?
Post by Paul Davis
the APIs that are used to write almost all audio software code in
production these days all use a callback model.
Sorry for questioning this statement. Of course we all don't have any statistical data but
you miss what I see as the majority of applications that use audio devices:

1) games
2) media players
3) GUI sounds (i.e. accessibility)

What is the point in comparing the amount of "audio software code"? Compare the number of
people using the above types of software with the number of people using semi-pro level
audio software.
Post by Paul Davis
porting from the o/c/r/w model to the callback one is
hard. do you want another generation of apps stuck with this problem?
So you think the solution is to lead developers to the right programming paradigm by not
making o/c/r/w type of APIs available anymore.

Again, I don't see the point. The 99% of Linux users using the above types of programs do
not care about the programming paradigm. They want to hear their apps. Since Linux
distributions enable artsd/esd sound daemons by default, people don't hear applications
that don't support the specific sound daemon. On Windows, we do have the choice and it all
happily coexists.

My perfect world would look like this:
- ALSA (becoming the default audio HAL on Linux) has the future smix plugin enabled by
default, but only if the soundcard does not provide hardware mixing
- sound daemons can all run at the same time, and they can continue to block the device if
they really think that's a good idea
- apps with higher requirements (low latency, sample synchronization, etc.) will need the
user to stop the daemons and will use ALSA hardware devices hw: directly.

I also see the problem that audio daemons block the soundcard. But that's another story.
Post by Paul Davis
if you want a genuinely portable solution, use PortAudio. it works
with (but hides) OSS, Windows MME, ASIO, CoreAudio, and several
others. JACK and ALSA support is present in CVS. it encourages a
callback model.
The "genuinely" portable solution is Java :)

Well, my point is that I want mainstream apps to work out of the box, no matter whether they
talk to a daemon, to ALSA, or use OSS. For semi-pro audio apps, a little effort to make them
use the full capabilities is necessary anyway, and those people probably won't mind stopping
a sound daemon.

Florian
--
Florian Bomers
Java Sound
Java Software/Sun Microsystems, Inc.
http://java.sun.com/products/java-media/sound/


Paul Davis
2002-11-26 23:11:18 UTC
Post by Florian Bomers
Post by Paul Davis
ALSA is *a* sound library. There are lots of things that it doesn't
I would really say: ALSA is *the* sound library (at least on Linux). Isn't it in kernel
2.5+ ?
alsa-lib isn't in kernel 2.5, because its not part of the kernel.

alsa-lib doesn't contain any code to read or write audio files, for
example. it contains no code to do dithering. it contains no code to
do audio rate resampling, at least no code that is accessible by a
regular program.
Post by Florian Bomers
I don't know if I get the point right here, but it reads: don't use
ALSA. But it's the only sound library that will be delivered with all
distributions in future, so what choice do mainstream application
writers have ? OSS ?
PortAudio would be a much better choice.
Post by Florian Bomers
So you discourage use of ALSA because it suggests/allows a
non-professional programming paradigm?
not just non-professional. the read/write paradigm isn't just
non-professional, its wrong. its masking a fundamental aspect of the
hardware. it works for video (which is also, essentially a real time
streaming media device) because nobody notices anything odd about a
video screen that doesn't change for a little while (unless they are
trying to watch a real-time video stream). but any audio device that
outputs the same data twice around its hardware buffer is immediately
noticeable by any (non-hearing impaired) user. you can't fake
this. apps that use large buffers are just getting away with
pretending; it may work for them, but it doesn't work for most apps. i
find the lag on the xmms eq controls, for example, quite irritating.
Post by Florian Bomers
Post by Paul Davis
the APIs that are used to write almost all audio software code in
production these days all use a callback model.
Sorry for questioning this statement. Of course we all don't have any statistical data but
you miss what I see as the majority of applications that use audio devices:
1) games
2) media players
3) GUI sounds (i.e. accessibility)
this is a fair point. i apologize. i have my head so far inside the
"audio software for musicians" box that i tend to fail to see such
applications :)

however, the very fact that the applications developers of such
programs really don't know much (if anything) about audio programming,
and probably don't want to know much about it either, suggests that an
API like ALSA which exposes the full HAL is probably a mistake. again,
i consider PortAudio a vastly preferable solution.

i would note in passing that your second class of apps might very well
be something that needs to be integrated with prosumer/pro apps. the
division between, say, alsaplayer and gdam (to give two examples from
the linux world) is hard to find when thinking up reasons why one
should never be useful as a participant in something like JACK, and
another one should.
Post by Florian Bomers
Post by Paul Davis
porting from the o/c/r/w model to the callback one is
hard. do you want another generation of apps stuck with this problem?
So you think the solution is to lead developers to the right programming paradigm by not
making o/c/r/w type of APIs available anymore.
pretty much, yes.
Post by Florian Bomers
Again, I don't see the point. The 99% of Linux users using the above
types of programs do not care about the programming paradigm. They
want to hear their apps. Since Linux distributions enable artsd/esd
sound daemons by default, people don't hear applications that don't
support the specific sound daemon. On Windows, we do have the choice
and it all happily coexists.
you're talking about a world in which there is a sharp divide between
the classes of apps you've listed above and the prosumer/pro apps i
tend to focus on. i'd rather see linux support audio APIs that
provide, if not an integrated user environment for all apps (this may
not be possible), then an integrated and consistent audio API, one
that recognizes the inherently real-time nature of audio software and
uses a callback-driven model. can you say "CoreAudio"? :))
Post by Florian Bomers
Well, my point is that I want mainstream apps to work out of the box, no
matter whether they talk to a daemon, to ALSA, or use OSS. For semi-pro
audio apps, a little effort to make them use the full capabilities
is necessary anyway, and those people probably won't mind stopping a
sound daemon.
Apple have done the right thing, IMHO. OS X forces developers to
either (1) deal with a full-featured and complex HAL, but in so doing
lose the ability to interact with other software, which users won't
like, or (2) use the AudioUnits API (callback based), which allows full
integration. The problem is, as I noted several times before, that there is
nobody in the linux world who can force this on developers.

--p


Florian Bomers
2002-11-27 00:54:02 UTC
That makes sense. I also fully agree that callback-driven APIs are better suited for
audio. On the other hand, nobody would claim that o/c/r/w apps cannot achieve low enough
latency for GUI-type uses like EQs (e.g. with buffer sizes of 10-20 ms the total latency
isn't so unacceptable).

But that's all getting a bit too far off of the original problem: sound on Linux is not
playing because another app or daemon blocks the sound device. For that particular
problem, PortAudio is more a work-around than a solution.

I think such an smix plugin would work like this (please correct if wrong):

1) standard "mainstream" users will use smix. Their software will need to use the
"plughw:" devices in ALSA; smix is enabled by default (OSS programs, ALSA programs, sound
daemons, ...).

2) more professional audio apps use "hw:" devices and fail if the smix plugin is being
used. The user needs to make sure that all software that uses the "plughw:" devices is
terminated, and then start the audio app again (see the device-open sketch below).

3) since smix shouldn't be activated for devices with hardware mixing, those users get the
best of both worlds. Needless to say, people who use semi-pro audio software are not so
likely to have only one non-mixing soundcard.

JACK is really in between, but maybe some users would rather use a "crippled" JACK working
on "plughw:" than having no OSS/artsd/etc. sound simultaneously.

Florian

PS: sorry for my ignorance, but I haven't said "CoreAudio" yet, I guess I should learn to
do so...
--
Florian Bomers
Java Sound
Java Software/Sun Microsystems, Inc.
http://java.sun.com/products/java-media/sound/


James Courtier-Dutton
2002-11-27 01:06:10 UTC
Post by Paul Davis
Post by Florian Bomers
Post by Paul Davis
the APIs that are used to write almost all audio software code in
production these days all use a callback model.
Sorry for questioning this statement. Of course we all don't have any statistical data but
you miss what I see as the majority of applications that use audio devices:
1) games
2) media players
3) GUI sounds (i.e. accessibility)
this is a fair point. i apologize. i have my head so far inside the
"audio software for musicians" box that i tend to fail to see such
applications :)
however, the very fact that the applications developers of such
programs really don't know much (if anything) about audio programming,
and probably don't want to know much about it either, suggests that an
API like ALSA which exposes the full HAL is probably a mistake. again,
i consider PortAudio a vastly preferable solution.
I would like to point out that a "callback" api would work just as well
as an open/write/close api for

1) games
2) media players
3) GUI sounds (i.e. accessibility)

I have to agree with Paul on the fact that a "callback" approach is really the ONLY real option.
Here is my reasoning: -
1) My perspective is from "(2) Media players" and not "Pro-Audio"
2) Sound Hardware tends to have very small buffers.
3) For nice sounding audio, these buffers should never run dry. I.E. No XRUNs.
4) A open/write/close api will never ever be able to guarantee no XRUNs, as it has no control on when it will get scheduling time to do the next write.
5) With a "callback" approach, the kernel would be notified by the sound hardware that it was ready for new samples, the kernel could then adjust the scheduler, so that the "callback" function was called ASAP.
The "callback" function then only has to decide which samples to give. If the "callback" function could receive a "delay" value from the sound hardware at each callback, a media player would then have all the information it needed to do full audio/video sync.
6) I don't need "sample sync", but I do NEED "callback" based api to provide me with "no XRUNs".

Summary: -
The only way to cure XRUN problems is with a "callback" based api.
All application that currently use open/write/close apis, can just as easily use a "callback" api.

Cheers
James





Jaroslav Kysela
2002-11-27 09:29:28 UTC
Post by James Courtier-Dutton
Post by Paul Davis
Post by Florian Bomers
Post by Paul Davis
the APIs that are used to write almost all audio software code in
production these days all use a callback model.
Sorry for questioning this statement. Of course we all don't have any statistical data but
you miss what I see as the majority of applications that use audio devices:
1) games
2) media players
3) GUI sounds (i.e. accessibility)
this is a fair point. i apologize. i have my head so far inside the
"audio software for musicians" box that i tend to fail to see such
applications :)
however, the very fact that the applications developers of such
programs really don't know much (if anything) about audio programming,
and probably don't want to know much about it either, suggests that an
API like ALSA which exposes the full HAL is probably a mistake. again,
i consider PortAudio a vastly preferable solution.
I would like to point out that a "callback" api would work just as well
as an open/write/close api for
1) games
2) media players
3) GUI sounds (i.e. accessibility)
I have to agree with Paul on the fact that a "callback" approach is really the ONLY real option.
Here is my reasoning: -
1) My perspective is from "(2) Media players" and not "Pro-Audio"
2) Sound Hardware tends to have very small buffers.
3) For nice sounding audio, these buffers should never run dry. I.E. No XRUNs.
4) A open/write/close api will never ever be able to guarantee no XRUNs, as it has no control on when it will get scheduling time to do the next write.
5) With a "callback" approach, the kernel would be notified by the sound hardware that it was ready for new samples, the kernel could then adjust the scheduler, so that the "callback" function was called ASAP.
The "callback" function then only has to decide which samples to give. If the "callback" function could receive a "delay" value from the sound hardware at each callback, a media player would then have all the information it needed to do full audio/video sync.
Sorry, it's not as easy as you've described. It's not possible to invoke
any user code from the kernel code directly. There is a scheduler which is
informed that a task has been woken up. It depends on the scheduler when the
task is really invoked. It's quite the same as for the r/w model, where the
application is notified over poll() that something occurred.
Post by James Courtier-Dutton
6) I don't need "sample sync", but I do NEED "callback" based api to provide me with "no XRUNs".
I don't think that there is any difference. If the scheduler doesn't give
you enough time, the audio stream is somehow broken on all architectures.
Post by James Courtier-Dutton
Summary: -
The only way to cure XRUN problems is with a "callback" based api.
All application that currently use open/write/close apis, can just as easily use a "callback" api.
Let's go and see the implementation:

The callback model is good for perfect sync between applications. It can
do (and does) chaining of more sources, arbitrating (removing invalid
sources) and so on. It is simply something "over" the audio HAL. If it
really helps, it's a different point.

The difference discussed (a few months ago) was in the number of task
context switches.

With jack, you have these context switches (daemon and two applications
mixed together) for one period of samples (* means a context switch):

jackd -*> app1 -*> app2 -*> jackd -> soundcard

With r/w model and a sample mixing server implemented in the user space,
you can get this for one period of samples:

mserver -> soundcard
-*(ring buffer does not contain enough samples)> app1
-*(ring buffer does not contain enough samples)> app2

In a real real-time setup there will be two periods buffered, thus app1 and app2
will be woken up every time. So, in the real world the context switches for
one period of samples are:

mserver -*> app1 -*> app2 or
mserver -*> app2 -*> app1

Note: the mserver implementation uses the same assumptions as jack (or any other
callback model). The ring buffers are shared between the app and mserver and
the period of samples is a constant. The playback / capture pointers are
incremented in period-size steps. Thus there is no need to commit the
result directly back to the mserver; mserver will be woken up by the
kernel scheduler when the next poll() event occurs (on the next period boundary).
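
To make that arrangement concrete, a rough sketch of what the shared segment
between one app and mserver could look like (everything here - the names,
the period size, mono S16 data, POSIX shm - is an assumption for
illustration, not existing ALSA code):

#include <fcntl.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define PERIOD   1024                   /* frames per period             */
#define NPERIODS 4                      /* ring size in periods          */

struct ring {
    _Atomic uint64_t write_periods;     /* advanced by the client app    */
    _Atomic uint64_t read_periods;      /* advanced by the mixing server */
    int16_t frames[NPERIODS * PERIOD];  /* mono S16 audio, for brevity   */
};

/* Client side: copy one period in if the ring has room. */
static int ring_write(struct ring *r, const int16_t *period)
{
    uint64_t w = atomic_load(&r->write_periods);
    if (w - atomic_load(&r->read_periods) >= NPERIODS)
        return 0;                                    /* ring full         */
    memcpy(&r->frames[(w % NPERIODS) * PERIOD], period,
           PERIOD * sizeof(int16_t));
    atomic_store(&r->write_periods, w + 1);
    return 1;
}

/* Server side: called once per period boundary (e.g. after poll() on the
 * sound device); mixes one period into the output, or skips on underrun. */
static void ring_read_into(struct ring *r, int32_t *mixbus)
{
    uint64_t rd = atomic_load(&r->read_periods);
    if (rd == atomic_load(&r->write_periods))
        return;                                      /* client was late   */
    const int16_t *src = &r->frames[(rd % NPERIODS) * PERIOD];
    for (int i = 0; i < PERIOD; i++)
        mixbus[i] += src[i];                         /* add into mix bus  */
    atomic_store(&r->read_periods, rd + 1);
}

/* Either side maps the segment like this (server passes create = 1). */
static struct ring *ring_map(const char *name, int create)
{
    int fd = shm_open(name, create ? O_CREAT | O_RDWR : O_RDWR, 0600);
    if (fd < 0) return NULL;
    if (create && ftruncate(fd, sizeof(struct ring)) < 0) {
        close(fd);
        return NULL;
    }
    void *p = mmap(NULL, sizeof(struct ring), PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, 0);
    close(fd);
    return p == MAP_FAILED ? NULL : (struct ring *)p;
}

Because both counters only ever move forward in whole periods, the server can
mix whatever is there at each poll() wakeup and never has to wait for a reply
from the client.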

OK, it's only a simple example, but it shows that there are more solutions than Paul
suggests. I fully agree that the callback model is suitable for
perfect synchronization among more applications. Also, imagine that
mserver is not using a soundcard as its output device, but jackd. So,
applications using r/w can use the benefits of jackd (of course, there will be
one more period of samples buffered, but who will care when the start is
synchronized?).

In my brain, there is also a totally different solution with zero context-
switching overhead - sharing the soundcard DMA buffer among more
applications. There is only one problem: the snd_pcm_rewind() implementation
cannot be perfect, because of wrapping when sample values are added (we lose
information which cannot be recovered). The question is whether that's a fatal
problem.

Let's discuss these things. Hopefully, I'll get some time to implement at
least the mserver functionality.

Jaroslav

-----
Jaroslav Kysela <***@suse.cz>
Linux Kernel Sound Maintainer
ALSA Project, SuSE Labs



tomasz motylewski
2002-11-27 11:36:31 UTC
Post by Jaroslav Kysela
In my brain, there is also totaly different solution with the zero context
switching overhead - sharing the soundcard DMA buffer among more
applications. There is only one problem: snd_pcm_rewind() implementation
This is my personal preference. In this model the only services ALSA has to
supply are:

1. initial configuration/start/stop.
2. mmapable DMA buffer
3. a fast and precise ioctl telling the current HW pointer in the buffer. If the
card is not queried each time, then the "last period interrupt" timestamp
should be included.

Please stop the complication of "available/delay" etc. Just the raw pointer.
Each application knows where its application pointer is, so it can easily
calculate delay/available and decide for itself whether there was an overrun or
not.

The driver does not care at all whether applications are writing to this buffer
or not. There should be one thread (ISR, kernel or separate application) which
zeroes part of the buffer just after the card has played it.

Each application/thread should not just copy the data to DMA buffer, but _add_
to it - in this way we get very efficient mixing (zero copy). Of course here I
assume that "add" operations on volatile buffers are atomic vs. context
switches - probably not on all architectures - we may need mutexes.
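
A sketch of that add-into-the-buffer mixing for interleaved signed 16-bit
samples (the clamping is exactly the information loss Jaroslav mentioned for
snd_pcm_rewind(), and as noted the addition is not atomic, so real code still
needs some locking):

#include <stddef.h>
#include <stdint.h>

static inline int16_t clamp_s16(int32_t v)
{
    if (v >  32767) return  32767;
    if (v < -32768) return -32768;
    return (int16_t)v;
}

/* Add 'n' samples of 'src' into the (already playing) buffer 'dst'. */
static void mix_add_s16(int16_t *dst, const int16_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = clamp_s16((int32_t)dst[i] + (int32_t)src[i]);
}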

ALSA API should also tell the applications how big the "hot zone" is (the area
where the data is first played and then set to 0).

I was able to achieve about 20 ms delay (analog-to-analog) when transmitting 10
ms RTP packets of data over the LAN with no overruns! OK, I have used HZ=2000
and low latency patches+soft RT.

I am mixing several RTP streams in this way: when one packet does not
arrive, the data from the other streams are there and I resynchronize just this
single stream. But the only buffering I do is in the DMA buffer of the card!
Each RTP source has its own "application ptr" and the data are added to the
buffer as they come.

Callbacks are not the most efficient there. If the data arrives 1 ms too late I
would lose the whole 10 ms period. If I still add it to the buffer, I will
lose only 1 ms and the card will play the remaining 9 ms of sound.

You will probably consider my method "ugly", "full of race conditions" etc. But
it gives minimum sound latency.

Best regards,
--
Tomasz Motylewski



James Courtier-Dutton
2002-11-27 13:21:15 UTC
Post by tomasz motylewski
Please stop the complication of "available/delay" etc. Just the raw pointer.
Each application knows where its application pointer is, so it can easily
calculate delay/available and decide for itself whether there was an overrun or
not.
I use the delay() function.
I help write a multimedia application that needs to sync audio to video.
So the question I ask alsa-lib is "If I write() now, how long will it be
before those samples come out of the speakers?"
So, I very much need delay(), but I also use avail so that I can tell
ALSA when to exit the snd_pcm_wait() poll, i.e. only exit snd_pcm_wait()
when there is enough free space in the buffer for me to bother with a new
write() call.
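
In code, that question to alsa-lib boils down to something like this sketch
('pcm' and 'rate' are assumed to have been set up elsewhere):

#include <alsa/asoundlib.h>

/* Returns the time, in seconds from now, at which samples written by the
 * next snd_pcm_writei() call should reach the speakers (ignoring any extra
 * latency in the codec/amplifier). */
static double playback_lead_time(snd_pcm_t *pcm, unsigned int rate)
{
    snd_pcm_sframes_t delay_frames = 0;
    if (snd_pcm_delay(pcm, &delay_frames) < 0 || delay_frames < 0)
        return 0.0;                     /* xrun or error: no queued audio */
    return (double)delay_frames / (double)rate;
}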

Cheers
James




Paul Davis
2002-11-27 13:53:49 UTC
Post by tomasz motylewski
This is my personal preference. In this model the only service ALSA has to supply are:
1. initial configuration/start/stop.
2. mmapable DMA buffer
3. fact and precise ioctl telling the current HW pointer in the buffer. If the
card is not queried each time, then the "last period interrupt" timestamp
should be included.
ALSA supplies all 3 of these. You can use them by themselves without
the rest of the API if you want. The only problem arises with hardware
that cannot give you accurate h/w pointer positions.
Post by tomasz motylewski
Please stop the complication of "available/delay" etc. Just the raw
pointer. Each application knows where its application pointer is, so
it can easily calculate delay/available and decide for itself whether
there was an overrun or not.
actually, it can't. if the user space application is delayed for
precisely 1 buffer's worth of data, it will see the pointer at what
appears to be the right place and believe that no xrun has
occurred. the only way around this is to provide either:

* h/w pointer position as a steadily incrementing value
* h/w pointer position *plus* interrupt count

i favor the latter since it provides for a longer time before wrapping
becomes an issue (ULONG_MAX interrupts).
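
A sketch of that check (the hw_status struct and the per-buffer-wrap
interrupt counter are assumptions made for illustration; the point is only
the arithmetic):

#include <stdbool.h>
#include <stdint.h>

struct hw_status {
    uint32_t hw_ptr;      /* playback position, wraps at buffer_size      */
    uint32_t irq_count;   /* incremented once per buffer wrap by the ISR  */
};

/* Absolute number of frames the hardware has played since the stream
 * started; wraps only after ~2^32 buffer cycles. */
static uint64_t hw_frames_played(struct hw_status s, uint32_t buffer_size)
{
    return (uint64_t)s.irq_count * buffer_size + s.hw_ptr;
}

/* Playback underrun: the hardware has reached (or passed) the last frame
 * the application has written. appl_frames is the app's own running count
 * of frames written, which never wraps at the buffer boundary. */
static bool playback_xrun(struct hw_status s, uint32_t buffer_size,
                          uint64_t appl_frames)
{
    return hw_frames_played(s, buffer_size) >= appl_frames;
}

With the absolute count, an application that is late by exactly one buffer no
longer looks like it is on time.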
Post by tomasz motylewski
Each application/thread should not just copy the data to DMA buffer, but _add_
to it - in this way we get very efficient mixing (zero copy). Of course here I
assume that "add" operations on volatile buffers are atomic vs. context
switches - probably not on all architectures - we may need mutexes.
they are not atomic on any architecture that i know of.
Post by tomasz motylewski
I was able to achieve about 20 ms delay (analog-to-analog) when
transmitting 10 ms RTP packets of data over the LAN with no
overruns! OK, I have used HZ=2000 and low latency patches+soft RT.
for the record, i've used this approach (inspired by someone from
poland whose name i forget) with an ISA card (no PCI burst size
issues) to achieve <10 sample processing latency. i scribbled a
special bit pattern into the hardware buffer, then looped over it
watching to see where the audio interface was and writing/reading data
directly ahead of it or behind it. it works, but of course it burns
CPU cycles like crazy (there is no wait for the device or anything
else, its just a continuous while loop).
Post by tomasz motylewski
Callbacks are not the most efficient there. If the data arrives 1 ms
too late I would loose the whole 10 ms period. If I still add it to
the buffer, I will loose only 1 ms and the card will play the
remaining 9 ms of sound.
this isn't a function of callbacks. its a function of the response of
the callback system to the delay. if the system notices that its
running late but goes ahead and executes anyway, the results are just
the same as you suggest for your approach. however, since there is
still a glitch in the audio stream (1ms of data is repeated/lost), a
system like JACK considers this an error.

--p



Jaroslav Kysela
2002-11-27 14:24:16 UTC
Post by tomasz motylewski
Post by Jaroslav Kysela
In my brain, there is also totaly different solution with the zero context
switching overhead - sharing the soundcard DMA buffer among more
applications. There is only one problem: snd_pcm_rewind() implementation
This is my personal preference. In this model the only service ALSA has to supply are:
1. initial configuration/start/stop.
2. mmapable DMA buffer
3. fact and precise ioctl telling the current HW pointer in the buffer. If the
card is not queried each time, then the "last period interrupt" timestamp
should be included.
We already have this sort of timestamp - SNDRV_PCM_TSTAMP_MMAP - but it's
quite useless when we don't have a continuous timer source. Hopefully, things
will change after 2.5 when high-res timers are included.
Post by tomasz motylewski
Please stop the complication of "available/delay" etc. Just the raw pointer.
Each application knows where its application pointer is, so it can easily
calculate delay/available and decide for itself whether there was an overrun or
not.
I got your point. It would be good to let the application read hw_ptr and
appl_ptr when it operates in "don't report xruns" mode (stop_threshold
== ULONG_MAX). We can do it quite easily by extending the current API.
Post by tomasz motylewski
The driver does not care at all whether applications are writing to this buffer
or not. There should be one thread (ISR, kernel or separate application) which
zeroes part of the buffer just after the card has played it.
Each application/thread should not just copy the data to DMA buffer, but _add_
to it - in this way we get very efficient mixing (zero copy). Of course here I
assume that "add" operations on volatile buffers are atomic vs. context
switches - probably not on all architectures - we may need mutexes.
Exactly, that's my idea. The question is the locking method. Mutexes are
slow. Perhaps we can try to write a kind of "user land" spinlock whose
contents are shared like the DMA ring buffer.
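
Such a user-land spinlock could be as small as this sketch (written with C11
atomics, which are newer than this thread; the lock word would live inside
the same shared mapping as the DMA ring buffer):

#include <stdatomic.h>

typedef struct {
    atomic_flag locked;          /* shared between all mixing processes */
} uspinlock_t;

#define USPINLOCK_INIT { ATOMIC_FLAG_INIT }

static inline void uspin_lock(uspinlock_t *l)
{
    /* Busy-wait; acceptable only because the critical section (adding one
     * period of samples into the buffer) is short and bounded. */
    while (atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire))
        ;
}

static inline void uspin_unlock(uspinlock_t *l)
{
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}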
Post by tomasz motylewski
ALSA API should also tell the applications how big the "hot zone" is (the area
where the data is first played and then set to 0).
I think that hw_ptr is quite sufficient (application will start to write
samples from hw_ptr + period_size).

Jaroslav

-----
Jaroslav Kysela <***@suse.cz>
Linux Kernel Sound Maintainer
ALSA Project, SuSE Labs



James Courtier-Dutton
2002-11-27 13:07:27 UTC
I have read your comments below, and I would like to try to explain the
problems I am coming up against when writing a multi-media app.

I am not going to say that I know everything about kernel scheduling,
but for multimedia applications, avoiding xruns is a major concern.
This becomes particularly important in SPDIF passthrough modes, because if
one is outputting AC3 or DTS non-audio packs to an external decoder, an
xrun will corrupt an entire AC3 or DTS pack that is equivalent to
anything between 512 and 4096 PCM sample frames. So losing a single
sample will be very noticeable to the user.
I am currently taking the following approach: -
Always prepare 2 audio hardware periods of sample frames in advance
inside the user app.

1) snd_pcm_wait()
2) write()
3) prepare new sample frames, then go back to (1).

Is this the best approach to use in order to avoid xruns?
Using the "plug" interface does make the user app easier to write, but
does it add too much overhead and so increase the risk of xruns?

Cheers
James
Paul Davis
2002-11-27 13:55:00 UTC
Post by James Courtier-Dutton
I am currently taking the following approach: -
Always prepare 2 audio hardware periods of sample frames in advance
inside the user app.
1) snd_pcm_wait()
2) write()
3) prepare new sample frames, then go back to (1).
for lower latency, you'd do:

1) snd_pcm_wait()
2) prepare new sample frames
3) write(), then go back to (1).

but for the kinds of things you are describing, your original order
seems OK.
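
Spelled out as a sketch (this uses the snd_pcm_set_params() convenience call
from later alsa-lib releases, and the device name, rate and tone are only
examples; error handling is reduced to restarting after an underrun):

#include <alsa/asoundlib.h>
#include <errno.h>
#include <math.h>

#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

#define RATE    44100
#define PERIOD  1024                       /* frames prepared per cycle */

int main(void)
{
    snd_pcm_t *pcm;
    short buf[PERIOD * 2];                 /* interleaved stereo S16    */
    double phase = 0.0;

    if (snd_pcm_open(&pcm, "plughw:0,0", SND_PCM_STREAM_PLAYBACK, 0) < 0)
        return 1;
    if (snd_pcm_set_params(pcm, SND_PCM_FORMAT_S16_LE,
                           SND_PCM_ACCESS_RW_INTERLEAVED,
                           2, RATE, 1, 100000 /* 100 ms buffer */) < 0)
        return 1;

    for (;;) {
        /* 1) sleep until the device can accept more data */
        snd_pcm_wait(pcm, 1000);

        /* 2) prepare the new period as late as possible */
        for (int i = 0; i < PERIOD; i++) {
            short s = (short)(3000.0 * sin(phase));
            buf[2 * i] = buf[2 * i + 1] = s;
            phase += 2.0 * M_PI * 440.0 / RATE;
        }

        /* 3) write it; on underrun (-EPIPE), re-prepare and continue */
        snd_pcm_sframes_t n = snd_pcm_writei(pcm, buf, PERIOD);
        if (n == -EPIPE)
            snd_pcm_prepare(pcm);
        else if (n < 0)
            break;
    }

    snd_pcm_close(pcm);
    return 0;
}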

--p


Paul Davis
2002-11-27 13:42:56 UTC
Post by Jaroslav Kysela
Ok, it's only simple example, that there are more solutions than Paul
suggests. I fully agree, that the callback model is suitable for the
perfect synchronization among more applications.
Let's be totally clear about this. its not just that the callback
model is suitable - the mserver model will actually not work for
sample sync between applications. I have always been sure that the
mserver model will work well as a replacement for things like esd and
artsd. if that's all that we needed, i would never have started work
on jack but would have put my work into the mserver model.

However, for the class of users i am interested in, the mserver model
isn't adequate, and thats why jack exists.
Post by Jaroslav Kysela
Also, imagine that
mserver is not using a soundcard as output device, but jackd. So,
applications using r/w can use benefits of jackd
precisely. this is one of the main reasons i'd like to see the mserver
stuff working.
Post by Jaroslav Kysela
In my brain, there is also totaly different solution with the zero context
switching overhead - sharing the soundcard DMA buffer among more
applications.
i have been thinking about this from a different perspective
recently. i recently modified ardour's internals so that the total
data flow for an audio interface looks like:

hardware buffer
-> JACK ALSA driver output port buffer
-> ardour output port buffer
-> hardware buffer

it keeps occurring to me that there is no technical reason why we have
the two intermediate copies. it would be amazing to find a way to
export the mmap'ed hardware buffer up into user space as shared
memory, tell JACK to use this for the port buffers of the ALSA jack
client, and then we can skip the copy. JACK will need some minor
modifications to its internals to permit just a single step, but i can
see how to do that. without that fix, it would still be better:

hardware buffer == JACK ALSA driver output port buffer
-> ardour output port buffer
-> JACK ALSA driver input port buffer == hardware buffer

i see this as more promising than the approach i think you are
thinking of. you can't avoid the context switching - they *have* to
happen so that the apps can run!! the question is *when* does it
happen ... in JACK, they are initiated in a chain when the interface
interrupts us. in the mserver model and/or the shared mmap'ed buffer
approach, they just have to happen sometime between interrupts
(otherwise the buffers are not handled in time). so there is no
avoiding them, its just a matter of when they happen. the point of
JACK's design is to force sample sync and to minimize latency - always
generating and processing audio as close to when it is handled by the
hardware as possible (hence the default 2 period setting). a model
that allows the context switching to occur "sometime" between
interrupts is more relaxed, but loses sample sync and slightly
increases (some kinds of) latency. that doesn't mean i think its a
stupid system, just one that lacks certain properties.

however, i do think that finding a kernel mechanism that would allow
the mmap'ed buffer to be used the way that shared memory can be used
is potentially extremely useful. even though data copying on modern
machines doesn't consume much of the cycles burnt by almost any audio
software, its still a cost that it would be nice to reduce.

--p


Kai Vehmanen
2002-11-27 15:21:39 UTC
Post by Paul Davis
Let's be totally clear about this. its not just that the callback
model is suitable - the mserver model will actually not work for
sample sync between applications. I have always been sure that the
I think this is the critical point: ALSA's smix/mserver doesn't actually
have to provide this. As Jaroslav has described, dropping that
requirement simplifies the implementation of the server code (which
already has to provide the full ALSA PCM API to clients), and as Florian
mentioned, this kind of server functionality would be generally useful
to a wide range of applications.
Post by Paul Davis
However, for the class of users i am interested in, the mserver model
isn't adequate, and thats why jack exists.
Yup, in addition to synchronous execution, JACK also provides interfaces
for connection and transport management (both of which are important
but not necessarily interesting to ALSA).
Post by Paul Davis
i see this as more promising than the approach i think you are
thinking of. you can't avoid the context switching - they *have* to
happen so that the apps can run!! the question is *when* does it
happen ... in JACK, they are initiated in a chain when the interface
interrupts us. in the mserver model and/or the shared mmap'ed buffer
approach, they just have to happen sometime between interrupts
(otherwise the buffers are not handled in time). so there is no
avoiding them, its just a matter of when they happen. the point of
JACK's design is to force sample sync and to minimize latency - always
generating and processing audio as close to when it is handled by the
hardware as possible (hence the default 2 period setting). a model
that allows the context switching to occur "sometime" between
interrupts is more relaxed, but loses sample sync and slightly
increases (some kinds of) latency. that doesn't mean i think its a
stupid system, just one that lacks certain properties.
That nicely summarizes the whole thing. I think there's room
for both servers. It should be possible to implement plugins
for both mserver-using-jack and jack-using-mserver scenarios.
Post by Paul Davis
however, i do think that finding a kernel mechanism that would allow
the mmap'ed buffer to be used the way that shared memory can be used
is potentially extremely useful. even though data copying on modern
machines doesn't consume much of the cycles burnt by almost any audio
software, its still a cost that it would nice to reduce.
Btw, I tend to think the memory bandwidth usage is a secondary issue.
Transferring data over PCI is somewhat problematic, but that is anyway
done only once per direction. As for memory bandwidth, today's computers
supply a _lot_ of it: for instance PC133 SDRAM gives 1.06 GB/s and
DDR333 gives 2.7 GB/s, while 48 channels of 32-bit data at 96 kHz is
only about 17.5 MB/s! Of course, since we only want to spend a small
fraction of each callback on memory transfers it doesn't make sense to
compare these figures directly, but they do give some perspective on
the issue.
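
As a sanity check on those figures, here is the arithmetic as a
throwaway C snippet (nothing ALSA-specific in it):

#include <stdio.h>

int main(void)
{
    const double channels = 48.0, bytes_per_sample = 4.0, rate = 96000.0;
    const double stream = channels * bytes_per_sample * rate;    /* bytes/s */
    const double pc133 = 1.06e9, ddr333 = 2.7e9;                 /* bytes/s */

    printf("stream bandwidth: %.1f MiB/s\n",
           stream / (1024.0 * 1024.0));                  /* ~17.6 */
    printf("share of PC133:   %.1f %%\n", 100.0 * stream / pc133);   /* ~1.7 */
    printf("share of DDR333:  %.1f %%\n", 100.0 * stream / ddr333);  /* ~0.7 */
    return 0;
}

So even on PC133 the raw stream is under two percent of the theoretical
memory bandwidth; copies only start to matter when they land inside a
tight low-latency callback.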

The primary issue is fast and reliable user-mode process wake-up. Once
you have that, plus deterministic audio code paths, the rest is just a
question of how much processing you can fit on a given piece of
hardware (... and for once, just buying more hardware can solve the
problem).
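
For the wake-up part, the usual recipe on Linux is SCHED_FIFO plus
locked memory. A minimal sketch - the priority value is an arbitrary
example, error handling is minimal, and it needs root:

#include <sched.h>
#include <string.h>
#include <sys/mman.h>
#include <stdio.h>

/* put the calling process into the realtime FIFO class and pin its
   memory, so a wake-up is not delayed behind ordinary timesharing
   processes or by page faults */
int make_realtime(int priority)
{
    struct sched_param p;

    memset(&p, 0, sizeof(p));
    p.sched_priority = priority;               /* e.g. 50, range 1..99 */
    if (sched_setscheduler(0, SCHED_FIFO, &p) < 0) {
        perror("sched_setscheduler");
        return -1;
    }
    if (mlockall(MCL_CURRENT | MCL_FUTURE) < 0) {
        perror("mlockall");
        return -1;
    }
    return 0;
}

int main(void)
{
    if (make_realtime(50) == 0)
        printf("running SCHED_FIFO with locked memory\n");
    return 0;
}
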
--
http://www.eca.cx
Audio software for Linux!



Jaroslav Kysela
2002-11-27 16:26:07 UTC
Permalink
Post by Paul Davis
i see this as more promising than the approach i think you are
thinking of. you can't avoid the context switching - they *have* to
happen so that the apps can run!! the question is *when* does it
happen ... in JACK, they are initiated in a chain when the interface
interrupts us. in the mserver model and/or the shared mmap'ed buffer
approach, they just have to happen sometime between interrupts
(otherwise the buffers are not handled in time). so there is no
avoiding them, its just a matter of when they happen. the point of
JACK's design is to force sample sync and to minimize latency - always
generating and processing audio as close to when it is handled by the
hardware as possible (hence the default 2 period setting). a model
that allows the context switching to occur "sometime" between
interrupts is more relaxed, but loses sample sync and slightly
Note that "sometime" means at the same time as jack invokes its
clients. The r/w applications block at the r/w or poll point, so they
are woken when the ring buffer condition is met. Thus, for 2 periods,
the conditions are the same as for jack.
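
As an illustration of the r/w case (a minimal sketch only - device
name, format and period size are arbitrary examples, and error handling
is omitted), the application simply blocks in snd_pcm_wait() and is
woken once per period interrupt:

#include <alsa/asoundlib.h>
#include <string.h>

#define RATE     48000
#define CHANNELS 2
#define PERIOD   256                     /* frames per period */

int main(void)
{
    snd_pcm_t *pcm;
    snd_pcm_hw_params_t *hw;
    snd_pcm_uframes_t period = PERIOD;
    unsigned int rate = RATE, nperiods = 2;
    short buf[PERIOD * CHANNELS];
    int dir = 0;

    snd_pcm_open(&pcm, "hw:0", SND_PCM_STREAM_PLAYBACK, 0);
    snd_pcm_hw_params_alloca(&hw);
    snd_pcm_hw_params_any(pcm, hw);
    snd_pcm_hw_params_set_access(pcm, hw, SND_PCM_ACCESS_RW_INTERLEAVED);
    snd_pcm_hw_params_set_format(pcm, hw, SND_PCM_FORMAT_S16);
    snd_pcm_hw_params_set_channels(pcm, hw, CHANNELS);
    snd_pcm_hw_params_set_rate_near(pcm, hw, &rate, &dir);
    snd_pcm_hw_params_set_period_size_near(pcm, hw, &period, &dir);
    snd_pcm_hw_params_set_periods_near(pcm, hw, &nperiods, &dir);  /* 2 periods */
    snd_pcm_hw_params(pcm, hw);
    snd_pcm_prepare(pcm);

    memset(buf, 0, sizeof(buf));                 /* silence */
    for (;;) {
        snd_pcm_wait(pcm, 1000);                 /* block until space is free */
        if (snd_pcm_writei(pcm, buf, PERIOD) < 0)
            snd_pcm_prepare(pcm);                /* crude xrun recovery */
    }
    return 0;
}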

Anyway, as Kai stated, we are reaching a consensus that both ways are
useful for some cases.

Jaroslav

-----
Jaroslav Kysela <***@suse.cz>
Linux Kernel Sound Maintainer
ALSA Project, SuSE Labs




Paul Davis
2002-11-27 13:45:52 UTC
Permalink
Post by Jaroslav Kysela
Sorry, it's not as easy as you've described. It's not possible to invoke
any user code from the kernel code directly. There is a scheduler which is
informed that a task has been woken up. It depends on scheduler when the
task is really invoked. It's quite same as for the r/w model where the
application is notified over poll that something occured.
i think we can consider the behaviour of the kernel scheduler when it
schedules a SCHED_FIFO task after an audio interface interrupt has
woken it to be very, very close to the kind of callback system james
is describing. ditto for any kind of wakeup of a SCHED_FIFO task
(e.g. when it's woken by a write to a pipe, or a signal).

it's not perfect, as i am very disappointed to discover, but it's also
very close.

--p



Jaroslav Kysela
2002-11-27 14:39:22 UTC
Permalink
Post by Paul Davis
Post by Jaroslav Kysela
Sorry, it's not as easy as you've described. It's not possible to invoke
any user code from the kernel code directly. There is a scheduler which is
informed that a task has been woken up. It depends on scheduler when the
task is really invoked. It's quite same as for the r/w model where the
application is notified over poll that something occured.
i think we can consider the behaviour of the kernel scheduler when it
schedules a SCHED_FIFO task after an audio interface interrupt has
woken it to be very, very close to the kind of callback system james
is describing. ditto for any kind of wakeup of a SCHED_FIFO task
(e.g. when its woken by a write to pipe, or a signal).
its not perfect, as i am very disappointed to discover, but its also
very close.
If you have multiple processes with the same priority, the time gaps
might be noticeable as well.

Jaroslav

-----
Jaroslav Kysela <***@suse.cz>
Linux Kernel Sound Maintainer
ALSA Project, SuSE Labs



James Courtier-Dutton
2002-11-27 14:03:14 UTC
Permalink
Post by James Courtier-Dutton
Post by James Courtier-Dutton
I am currently taking the following approach: -
Always prepare 2 audio hardware periods of sample frames in advance
inside the user app.
1) snd_pcm_wait()
2) write()
3) prepare new sample frames, then go back to (1).

versus:

1) snd_pcm_wait()
2) prepare new sample frames
3) write(), then go back to (1).
but for the kinds of things you are describing, your original order
seems OK.
--p
I suppose it depends on what latency we are measuring.
My first concern is avoiding xruns, and preparing new sample frames
takes quite a lot of time.
For my application it does not really matter how long it takes for the
samples to reach the speakers; it only matters that we know very
accurately how long it is in fact taking (so we use the delay()
function). So for xrun avoidance I think my ordering is better than
yours.
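
In loop form the two orderings look roughly like this (a sketch only:
pcm is assumed to be an already opened and configured blocking handle,
and prepare_frames() is a hypothetical stand-in for whatever generates
the next period of audio):

#include <alsa/asoundlib.h>

#define PERIOD 256  /* frames per period; buf must hold PERIOD interleaved frames */

extern void prepare_frames(short *buf, snd_pcm_uframes_t frames);  /* hypothetical */

/* first ordering: the next period is already prepared, so the write can
   happen as soon as snd_pcm_wait() wakes us - the expensive preparation
   is kept out of the critical path, which helps avoid xruns */
void loop_write_then_prepare(snd_pcm_t *pcm, short *buf)
{
    prepare_frames(buf, PERIOD);            /* primed in advance */
    for (;;) {
        snd_pcm_wait(pcm, 1000);            /* 1) wait    */
        snd_pcm_writei(pcm, buf, PERIOD);   /* 2) write   */
        prepare_frames(buf, PERIOD);        /* 3) prepare */
    }
}

/* second ordering: prepare after waking, then write - the data is
   fresher, but the write lands later in the period, so a slow
   prepare_frames() is more likely to cause an xrun */
void loop_prepare_then_write(snd_pcm_t *pcm, short *buf)
{
    for (;;) {
        snd_pcm_wait(pcm, 1000);            /* 1) wait    */
        prepare_frames(buf, PERIOD);        /* 2) prepare */
        snd_pcm_writei(pcm, buf, PERIOD);   /* 3) write   */
    }
}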

Cheers
James




Jaroslav Kysela
2002-11-27 14:43:14 UTC
Permalink
Post by Paul Davis
Post by tomasz motylewski
This is my personal preference. In this model the only services ALSA
has to provide are:
1. initial configuration/start/stop.
2. a mmapable DMA buffer
3. a fast and precise ioctl telling the current HW pointer in the
buffer. If the card is not queried each time, then the "last period
interrupt" timestamp should be included.
ALSA supplies all 3 of these. You can use them by themselves without
the rest of the API if you want. The only problem arises with hardware
that cannot give you accurate h/w pointer positions.
Post by tomasz motylewski
Please stop the complication of "available/delay" etc. Just the raw
pointer. Each application knows where its application pointer is, so
it can easily calculate delay/available and decide for itself whether
there was an overrun or not.
actually, it can't. if the user space application is delayed for
precisely 1 buffer's worth of data, it will see the pointer at what
appears to be the right place and believe that no xrun has occurred.
* h/w pointer position as a steadily incrementing value
* h/w pointer position *plus* interrupt count
i favor the latter since it provides for a longer time before wrapping
becomes an issue (ULONG_MAX interrupts).
The ALSA internal code already keeps hw_ptr and appl_ptr within the
range 0 to boundary, where boundary is close to ULONG_MAX and
boundary / period_size is an integer. So the hardware pointer
implicitly contains the count of hardware interrupts as well.
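
In other words (illustration only - plain C arithmetic with example
values, not alsa-lib calls):

#include <stdio.h>

int main(void)
{
    /* example values only; in ALSA, boundary is chosen close to
       ULONG_MAX and divisible by the period size */
    unsigned long hw_ptr      = 123456789UL;   /* running frame counter */
    unsigned long period_size = 256;           /* frames per period     */
    unsigned long buffer_size = 2 * period_size;

    printf("position in buffer: %lu frames\n", hw_ptr % buffer_size);
    printf("period interrupts:  %lu\n",        hw_ptr / period_size);
    return 0;
}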

Jaroslav

-----
Jaroslav Kysela <***@suse.cz>
Linux Kernel Sound Maintainer
ALSA Project, SuSE Labs



Paul Davis
2002-11-27 15:04:09 UTC
Permalink
Post by Jaroslav Kysela
Post by Paul Davis
actually, it can't. if the user space application is delayed for
precisely 1 buffer's worth of data, it will see the pointer at what
appears to be the right place and believe that no xrun has occurred.
* h/w pointer position as a steadily incrementing value
* h/w pointer position *plus* interrupt count
i favor the latter since it provides for a longer time before wrapping
becomes an issue (ULONG_MAX interrupts).
The ALSA internal code already keeps hw_ptr and appl_ptr within the
range 0 to boundary, where boundary is close to ULONG_MAX and
boundary / period_size is an integer. So the hardware pointer
implicitly contains the count of hardware interrupts as well.
i know that. my point was that wraparound happens much earlier than if
we used a two part system (pointer-in-buffer + interrupt count). this
would be a similar system to the one used by the RTC, btw. we wouldn't
wrap with a two part system for around 66 days with 64 frames/period
at 48kHz. in the current system we wrap in about 1 day or so.
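
the arithmetic for the 32-bit case, as a throwaway snippet (nothing
ALSA-specific in it):

#include <stdio.h>

int main(void)
{
    const double ulong_max = 4294967295.0;   /* 2^32 - 1 on a 32-bit box */
    const double rate = 48000.0, period = 64.0, day = 86400.0;

    printf("frame counter wraps after     %.1f days\n",
           ulong_max / (rate * day));                 /* ~1.0  */
    printf("interrupt counter wraps after %.1f days\n",
           ulong_max / ((rate / period) * day));      /* ~66.3 */
    return 0;
}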

it doesn't make much difference either way, i was just pointing out a
different way to do it, and more importantly, correcting the claim
that by knowing where the h/w pointer is *in the buffer*, the
application can know if there has been an xrun. the ALSA API h/w
pointer, as you point out, implicitly contains the interrupt count,
but it's not as raw as was being requested.

it's also worth noting that it too is not immune from missing xruns,
but there isn't anything we can do about kernel/driver code that
blocks interrupts for an entire buffer :(

--p


Mark Swanson
2002-11-28 13:56:55 UTC
Permalink
Post by Paul Davis
its also worth noting that it too is not immune from missing xruns,
but there isn't anything we can do about kernel/driver code that
blocks interrupts for an entire buffer :(
I do not know how long an entire buffer is. I assume this will differ
per card, but how small could it be in the worst case?

Thanks.

--
Schedule your world with ScheduleWorld.com
http://www.ScheduleWorld.com/



Paul Davis
2002-11-28 14:29:37 UTC
Permalink
Post by Mark Swanson
Post by Paul Davis
its also worth noting that it too is not immune from missing xruns,
but there isn't anything we can do about kernel/driver code that
blocks interrupts for an entire buffer :(
I do not know how long an entire buffer is. I assume this will differ per
card, but how small could the worst-case possibly be?
the software gets to configure it. typical values range from 128
frames to 8192 frames, so at 48kHz, that would be 2.6ms to 170ms.

--p



tomasz motylewski
2002-11-27 15:47:46 UTC
Permalink
Post by Paul Davis
Post by tomasz motylewski
Please stop the complication of "available/delay" etc. Just the raw
pointer. Each application knows where its application pointer is, so
it can easily calculate delay/available and decide for itself whether
there was an overrun or not.
actually, it can't. if the user space application is delayed for
precisely 1 buffer's worth of data, it will see the pointer at what
appears to be the right place and believe that no xrun has occurred.
Well, if you combine it with the current time information, then you
will know whether the buffer has wrapped around or not - if you have
1000 ms of buffer but always run 20 ms from the front, and you notice
that 1010 ms have passed since the last action, then you have an
overrun, even if the HW pointer seems to be in the right place. So
configuration + mmaped buffer + gettimeofday + HW pointer and I am
happy. And maybe a system call "mix data into buffer (start, n_frames,
data)" for an atomic add. This provides minimum interference between
programs (if they agree on frame rate, format, etc.).
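
Roughly like this (a sketch only - the buffer length, the margin and
the 20 ms of "work" are arbitrary example values):

#include <stdio.h>
#include <sys/time.h>
#include <unistd.h>

static double now_ms(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return tv.tv_sec * 1000.0 + tv.tv_usec / 1000.0;
}

int main(void)
{
    const double buffer_ms = 1000.0;   /* 1000 ms of buffer */
    double last = now_ms();

    for (;;) {
        usleep(20 * 1000);             /* stand-in for: read HW pointer, mix, send */
        double now = now_ms();
        if (now - last > buffer_ms)    /* e.g. 1010 ms have passed */
            fprintf(stderr, "probable overrun: buffer wrapped behind us\n");
        last = now;
    }
    return 0;
}
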
Post by Paul Davis
for the record, i've used this approach (inspired by someone from
poland whose name i forget) with an ISA card (no PCI burst size
issues) to achieve <10 sample processing latency. i scribbled a
special bit pattern into the hardware buffer, then looped over it
watching to see where the audio interface was and writing/reading data
directly ahead of it or behind it. it works, but of course it burns
CPU cycles like crazy (there is no wait for the device or anything
else, its just a continuous while loop).
Well, I use that method for reading data - I know how many samples I
need to gather before I send them, so I can calculate how many ms to
sleep. If I read a few samples too many, I just sleep less next time.
The jitter is about 300 microseconds in the RTP data stream I send, and
it is quite CPU efficient.

But how did you do it for writing data to the card? Are your read and
write buffers HW synchronized?
Post by Paul Davis
this isn't a function of callbacks. its a function of the response of
the callback system to the delay. if the system notices that its
running late but goes ahead and executes anyway, the results are just
the same as you suggest for your approach. however, since there is
well, no. Let's say I get the "write callback" in time and I should
give some data to the card, but I do not have it - it may arrive in
1 ms, but then I can only throw it away, because I will get the next
"write callback" 9 ms from then.

I like callbacks for the "read" direction.

Best regards,
--
Tomasz Motylewski



Paul Davis
2002-11-28 13:50:49 UTC
Permalink
Post by tomasz motylewski
Post by Paul Davis
actually, it can't. if the user space application is delayed for
precisely 1 buffer's worth of data, it will see the pointer at what
appears to be the right place and believe that no xrun has occurred.
Well, if you combine it with the current time information, then you
will know whether the buffer has wrapped around or not - if you have
1000 ms of buffer but always run 20 ms from the front, and you notice
that 1010 ms have passed since the last action, then you have an
overrun, even if the HW pointer seems to be in the right place. So
configuration + mmaped buffer + gettimeofday + HW pointer and I am
happy.
you should try playing with a system driven by a varispeed word
clock. you'll soon find that gettimeofday(2) is completely useless to
you. the sample clock is the only clock that counts. say it twice
every morning when you get up :)
Post by tomasz motylewski
And maybe a system call "mix data into buffer (start, n_frames, data)"
for an atomic add. This provides minimum interference between programs
(if they agree on frame rate, format, etc.).
that is quite a big "if". still, the plughw layer can help with that i
suppose.
Post by tomasz motylewski
Well, I use that method for reading data - I know how many samples I
need to gather before I send them, so I can calculate how many ms to
sleep. If I
if you have a kernel that will reliably sleep for 1ms, i envy you :)
Post by tomasz motylewski
But how did you do it for writing data to the card? Are your read and
write buffers HW synchronized?
i used the same technique as for reading. the buffer gets scribbled
with a bit pattern, and we look for the "leading edge": the first
location where two consecutive bytes still contain the special pattern.
requiring two bytes minimizes the problems caused by real sample values
that happen to match the pattern - the chances of two in a row doing so
are small enough to be mostly ignorable.

but note: this technique doesn't work so well for PCI cards because of
the bus transfer block size.
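
for the record, a sketch of the scribble-and-scan idea - not the actual
code, and the sentinel value is arbitrary:

#include <string.h>
#include <stddef.h>

#define SENTINEL 0xA5   /* arbitrary scribble value */

/* scan the (mmap'ed) buffer for the "leading edge": the first offset
   where two consecutive bytes still hold the sentinel, i.e. just past
   the data the hardware has most recently written or consumed.
   returns -1 if the whole buffer looks like data. */
static long find_leading_edge(const unsigned char *buf, size_t len)
{
    size_t i;

    for (i = 0; i + 1 < len; i++)
        if (buf[i] == SENTINEL && buf[i + 1] == SENTINEL)
            return (long) i;
    return -1;
}

/* after reading or writing around the edge, re-scribble the region we
   have consumed so the next scan still finds a clean edge */
static void rescribble(unsigned char *buf, size_t start, size_t n)
{
    memset(buf + start, SENTINEL, n);
}

int main(void)
{
    unsigned char buf[16];

    memset(buf, SENTINEL, sizeof(buf));   /* pretend this is the hw buffer */
    buf[0] = 0x12; buf[1] = 0x34;         /* "hardware" wrote two bytes    */

    long edge = find_leading_edge(buf, sizeof(buf));   /* -> 2 */
    rescribble(buf, 0, 2);                /* consumed them, scribble again */
    return (int) edge;
}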

--p

