Discussion:
[Jack-Devel] Fwd: connecting to JACKD2 with low buffer sizes
Christopher Obbard
2018-03-25 10:32:03 UTC
Hi Guys,

Running jack with a small buffer (-p64 -n2), and connecting with any
client causes issues.
With higher buffer sizes, all is OK.

This is on an ARM embedded system, with a single core 1MHz processor.
I've set the cpu governor to performance.
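(For reference, I just pin it via sysfs; the exact cpufreq path may differ per SoC, so treat this as a sketch:)

$ echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
performance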

I have a custom compiled 4.14 kernel with omap2plus_defconfig with the
CONFIG_PREEMPT_VOLUNTARY=y
HZ_100=y
CONFIG_NO_HZ_IDLE=y
$ jackd -R -P95 -dalsa -dhw:0 -r48000 -p64 -n2
JackPosixProcessSync::LockedTimedWait error usec = 5000000 err = Connection timed out
Driver is not running
Cannot create new client
CheckSize error size = 32 Size() = 12
CheckRead error
CheckSize error size = -1 Size() = 4
CheckRead error
CheckSize error size = 0 Size() = 12
CheckRead error
Also, with higher buffer sizes sometimes I get xruns. After enabling
ALSA: PCM: [Q] Lost interrupts?: (stream=0, delta=221, new_hw_ptr=181343, old_hw_ptr=181122)
Losing the FIFO interrupts doesn't seem like a great thing...


I _think_ the issue is to do with the kernel scheduling, but this is why
I ask the experts :-).
Can anyone suggest any kernel settings that will improve my situation?


Many thanks,

Chris
Robert Bielik
2018-03-26 07:45:45 UTC
Hi Chris,

Not sure how much this helps, but I have been running the audioinjector octocard (6 in / 8 out) on an RPi 3 with stable streaming down to 8 frames per buffer, using a distribution called RealtimePi (https://github.com/guysoft/RealtimePi). Unfortunately I don't know the specific kernel details of it.

My exact setup can be found here: http://forum.audioinjector.net/viewtopic.php?f=5&t=2727&start=30#p5749

The problem, of course, is that the lower the buffer size, the higher the CPU utilization. With 64 frames per buffer, the latency/CPU tradeoff is acceptable (I think with 8 frames per buffer I had nearly 100% CPU just shoveling audio from input to output; with 64 it's ~10%).
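To put rough numbers on it (assuming 48 kHz and 2 periods): one 64-frame period is 64/48000 ≈ 1.33 ms, so a 2-period buffer holds about 2.7 ms of audio and the interrupt fires 750 times a second. At 8 frames a period is ≈ 0.17 ms, the buffer ≈ 0.33 ms, and the interrupt rate jumps to 6000 per second, which is where the CPU time goes.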

Regards
/Robert
Chris Caudle
2018-03-26 13:40:21 UTC
Post by Christopher Obbard
This is on an ARM embedded system, with a single core 1MHz processor.
Is that a typo or are you really running on a 1MHz processor? If so that
would be a really slow clock. That would be two orders of magnitude
slower than most M class microcontrollers.
Post by Christopher Obbard
I have a custom compiled 4.14 kernel with omap2plus_defconfig with the
CONFIG_PREEMPT_VOLUNTARY=y
I would not consider CONFIG_PREEMPT_VOLUNTARY appropriate for low latency
use, and especially if you are really running a 1MHz processor. Using the
full RT patch set would be recommended, or at the very least
CONFIG_PREEMPT.
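For a 4.14 kernel the relevant choices look roughly like this; the option names drift between versions, and PREEMPT_RT_FULL only exists once the -rt patch is applied, so take this as a sketch:

# pick one of these instead of CONFIG_PREEMPT_VOLUNTARY
CONFIG_PREEMPT=y           # "low-latency desktop" preemption, available in mainline
CONFIG_PREEMPT_RT_FULL=y   # fully preemptible kernel, requires the -rt patch set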
Post by Christopher Obbard
$ jackd -R -P95 -dalsa -dhw:0 -r48000 -p64 -n2
JackPosixProcessSync::LockedTimedWait error usec = 5000000 err = Connection timed out
Driver is not running
Cannot create new client
CheckSize error size = 32 Size() = 12
CheckRead error
CheckSize error size = -1 Size() = 4
CheckRead error
CheckSize error size = 0 Size() = 12
CheckRead error
What audio hardware are you running? If those messages all came right at
startup from jack it would seem that the parameters don't match something,
but jack usually gives better error messages than that. There were never
any messages indicating that the requested sample rate was actually used,
word length used, etc?
You usually should see some messages like this at the end of jackd startup:

configuring for 48000Hz, period = 1024 frames (21.3 ms), buffer = 3 periods
ALSA: final selected sample format for capture: 24bit little-endian in 3bytes format
ALSA: use 3 periods for capture
ALSA: final selected sample format for playback: 24bit little-endian in 3bytes format
ALSA: use 3 periods for playback

Missing those messages could perhaps indicate that jackd was not able to
open the ALSA device at all.
Are you using an ALSA driver from the current kernel tree?
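A quick sanity check, independent of jack, is whether ALSA registered the card at all:

$ cat /proc/asound/cards
$ aplay -l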
Post by Christopher Obbard
Also, with higher buffer sizes sometimes I get xruns.
You are running PREEMPT_VOLUNTARY on a 1MHz processor, a more typical
configuration for low latency use is running PREEMPT or PREEMPT_RT on an
1800MHz to 3000MHz processor, so it is a little bit hard to know where to
start.
If you are really running a 1MHz processor, there is only time for 667
instructions for each 32 sample buffer at 48kHz rate. That is not very
much for a general purpose OS.
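(The arithmetic: 32 samples / 48000 Hz ≈ 667 microseconds per period, and a 1MHz clock retires very roughly one instruction per microsecond, so ~667 instruction slots before the next buffer is due.)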
Post by Christopher Obbard
ALSA: PCM: [Q] Lost interrupts?: (stream=0, delta=221,
new_hw_ptr=181343, old_hw_ptr=181122)
I would start with a better scheduler, PREEMPT or preferably PREEMPT_RT.
The usual advice these days is that you don't really need PREEMPT_RT, but
that is usually for someone running a multi GHz processor and running at a
period size 256 samples or larger. For 64 or 32 sample period size on a
slow processor you are going to have to get aggressive with optimizing.
--
Chris Caudle
Christopher Obbard
2018-03-26 14:38:44 UTC
Hi Chris,
Post by Chris Caudle
Post by Christopher Obbard
This is on an ARM embedded system, with a single core 1MHz processor.
Is that a typo or are you really running on a 1MHz processor? If so that
would be a really slow clock. That would be two orders of magnitude
slower than most M class microcontrollers.
Big typo! I am running on a TI 1GHz AM3358 processor.
Post by Chris Caudle
Post by Christopher Obbard
I have a custom compiled 4.14 kernel with omap2plus_defconfig with the
CONFIG_PREEMPT_VOLUNTARY=y
I would not consider CONFIG_PREEMPT_VOLUNTARY appropriate for low latency
use, and especially if you are really running a 1MHz processor. Using the
full RT patch set would be recommended, or at the very least
CONFIG_PREEMPT.
I've been using CONFIG_PREEMPT in other places too; I have been
wondering whether the full RT patch will cause less throughput for the
jack process.

Both jack and my application are requesting SCHED_FIFO, I am not sure
of the priority of the application but I am thinking of setting them
both to 70-80.
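I guess I can at least check what is actually running SCHED_FIFO, and at what priority, with something like this (field names as I remember them from procps, so treat it as a sketch), and then bump a thread with chrt:

$ ps -eLo pid,tid,cls,rtprio,comm | grep -E 'jackd|irq/'
$ chrt -f -p 75 <tid>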

What about HZ, I currently have this set to HZ_100. Would HZ_1000 be any better?
Currently I have CONFIG_NO_HZ_IDLE set. Would CONFIG_HZ_PERIODIC be
any better? I doubt it, as my processor is never idle :-).
Can you suggest any other options that may improve things?
Post by Chris Caudle
What audio hardware are you running?
Are you using an ALSA driver from the current kernel tree?
It's a 6-channel in, 6-channel out card with a simple ALSA driver
written by me. The driver just binds the codecs to the CPU, nothing
latent in there. All of the FIFO stuff is taken care of by the TI
McASP driver.
Currently the driver reads & writes to 8 channels but two of the out
channels are unused, I can possibly try to gain some performance here.
Post by Chris Caudle
Post by Christopher Obbard
$ jackd -R -P95 -dalsa -dhw:0 -r48000 -p64 -n2
JackPosixProcessSync::LockedTimedWait error usec = 5000000 err = Connection timed out
Driver is not running
Cannot create new client
CheckSize error size = 32 Size() = 12
CheckRead error
CheckSize error size = -1 Size() = 4
CheckRead error
CheckSize error size = 0 Size() = 12
CheckRead error
If those messages all came right at
startup from jack it would seem that the parameters don't match something,
but jack usually gives better error messages than that. There were never
any messages indicating that the requested sample rate was actually used,
word length used, etc?
Sorry, I missed out the first part of the log. The LockedTimedWait
error came when my client tried to connect. I think it's to do with
the Kernel scheduling.
Post by Chris Caudle
Missing those messages could perhaps indicate that jackd was not able to
open the ALSA device at all.
It manages to open and all performs fine at 128 frames.
Post by Chris Caudle
Post by Christopher Obbard
Also, with higher buffer sizes sometimes I get xruns.
You are running PREEMPT_VOLUNTARY on a 1MHz processor, a more typical
configuration for low latency use is running PREEMPT or PREEMPT_RT on an
1800MHz to 3000MHz processor, so it is a little bit hard to know where to
start.
If you are really running a 1MHz processor, there is only time for 667
instructions for each 32 sample buffer at 48kHz rate. That is not very
much for a general purpose OS.
Sorry, typo'd again here!
Post by Chris Caudle
Post by Christopher Obbard
ALSA: PCM: [Q] Lost interrupts?: (stream=0, delta=221,
new_hw_ptr=181343, old_hw_ptr=181122)
I would start with a better scheduler, PREEMPT or preferably PREEMPT_RT.
The usual advice these days is that you don't really need PREEMPT_RT, but
that is usually for someone running a multi GHz processor and running at a
period size 256 samples or larger. For 64 or 32 sample period size on a
slow processor you are going to have to get aggressive with optimizing.
Thanks. This is a good starting point.

Really I need to choose whether to run PREEMPT or PREEMPT_RT; I will
probably just go for PREEMPT_RT.
I've not had much luck with PREEMPT_RT in the past, I don't think I've
set the priorities of interrupts properly.

I think I need to chrt the soundcard IRQ highest, then JACK, then my
application....
AFAIK, jack already does this with the -R argument & sets the
scheduler priority to SCHED_FIFO.
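So something like the following is what I have in mind, with made-up priority numbers, assuming a threaded soundcard IRQ under PREEMPT_RT (the IRQ thread name depends on what the driver registered):

$ ps -eo pid,rtprio,comm | grep 'irq/'           # find the mcasp IRQ thread
$ chrt -f -p 90 <pid-of-mcasp-irq-thread>        # soundcard IRQ highest
$ jackd -R -P85 -dalsa -dhw:0 -r48000 -p64 -n2   # jackd just below it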


Have you got any links to information, books etc on RT patch? I've
read a few and most just seem to discuss how to patch the kernel,
without any real-life examples!


Thanks again for the very useful comments,

Chris
Chris Caudle
2018-03-26 19:06:24 UTC
Post by Christopher Obbard
Big typo! I am running on a TI 1GHz AM3358 processor.
Ok, three orders of magnitude is a pretty significant difference. So is
this a BeagleBone Black or something similar?
If so I run the RT kernels from the RCN repo, never had any problem with
those so might be a good starting point.
Post by Christopher Obbard
I've been using CONFIG_PREEMPT also in other places, I have been
wondering whether the full RT patch will cause less throughput for
the jack process.
The design of jackd is not about throughput, it is about low latency. If
you want better throughput, use larger period sizes. That is not really
specific to jack, that is just a general principle in computing, if you
want the best throughput using a general purpose computer design you
should bundle your data into larger groups so that the processor has to
handle fewer interrupts and the DMA controller(s) can run in the most
efficient configuration. If you want the lowest latency you will have to
trade off some throughput for that.
If you want really low latency and high throughput you will likely have to
design custom hardware with pipelined processing (e.g. some of the
cut-through switch designs used for Infiniband and very low latency HPC
Ethernet switches). That is a more advanced topic than how to get jackd
running on an off the shelf processor.
Post by Christopher Obbard
Both jack and my application are requesting SCHED_FIFO, I am not sure
of the priority of the application but I am thinking of setting them
both to 70-80.

Why not start off reading the linux audio and jack FAQs that explain how
you should configure your RT system? It sounds like you are getting ready
to re-discover all of that information the long and hard way. Might be
entertaining and educational but probably is not the best use of your time.
http://jackaudio.org/faq/linux_rt_config.html
http://jackaudio.org/faq/linux_group_sched.html
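For reference, on most distros what those pages boil down to is an entry like this in /etc/security/limits.conf (or a file under limits.d), assuming your user is in the audio group:

@audio   -   rtprio    95
@audio   -   memlock   unlimited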

I thought there was a page somewhere with suggestions on RT priority
settings, but I do not see it on the jackaudio FAQ or wiki pages. The
rtirq script from Rui might be useful:
http://www.rncbc.org/jack/
(link is toward the bottom)

There may be some useful info here:
https://wiki.linuxaudio.org/wiki/system_configuration
but unfortunately the low latency page is from 2000, there have been quite
a few changes in kernel setup since then, so I don't know how useful the
low latency wiki page actually is on linuxaudio.org these days.

The short version is you want the interrupt for your sound hardware to be
highest, then jack right after that. There probably won't be any other RT
tasks on a single use system, so it probably won't matter whether you set
the sound IRQ and jackd priorities to 90 and 89, or 70 and 69, just make
sure they are higher than the default for non-RT tasks, which I think is
50.

The applications using jackd actually run the RT audio thread in jackd
context, so unless I misunderstood something it should not in principle
matter what the priority of the (main thread) of the other applications is
set to.
Post by Christopher Obbard
What about HZ, I currently have this set to HZ_100. Would HZ_1000 be any
better?

Could be better for scheduling latency, but for IRQ driven tasks like
handling the sound hardware I don't know if it matters or not.
Post by Christopher Obbard
Post by Chris Caudle
What audio hardware are you running?
Are you using an ALSA driver from the current kernel tree?
It's a 6-channel in, 6-channel out card with a simple ALSA driver
written by me.

There appears to be a TI McASP driver in the kernel tree already, any
reason you are not using that? Just asking, I have not tried to base
anything on that driver yet myself.
http://processors.wiki.ti.com/index.php/Sitara_Linux_Audio_Driver_Overview

From that page it appears you would just need to write the codec driver
that handles specifics of the platform driver settings needed, and any
control interfaces.
Post by Christopher Obbard
The driver just binds the codecs to the CPU, nothing latent in there.
All of the FIFO stuff is taken care of by the TI McASP driver.
OK, it sounds like maybe that is actually what you did, just wrote the
codec portion and used the existing McASP driver to handle the I2S/TDM
interface pieces.
Post by Christopher Obbard
Currently the driver reads & writes to 8 channels but two of the
out channels are unused, I can possibly try to gain some
performance here.
Not worth the trouble at this point. If you get a system that is mostly
working and just has an occasional problem it might be worth optimizing,
but it sounds like right now the system is pretty broken, so look for big
problems, not small optimizations you could make.
Post by Christopher Obbard
Sorry, I missed out the first part of the log. The LockedTimedWait
error came when my client tried to connect. I think it's to do
with the Kernel scheduling.
Make sure you have RT scheduling enabled for whatever user account you are
using, try again and get the entire startup log.
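From the account that launches jackd, something like this should show non-zero limits if RT scheduling is actually allowed:

$ ulimit -r    # max realtime priority the account may request
$ ulimit -l    # max locked memory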
Post by Christopher Obbard
It manages to open and all performs fine at 128 frames.
OK, then make sure you are really clear about when you are trying to
debug problems with e.g. 32 frame setting vs. problems with the 128 frames
setting, since it looks like you have different problems with the
different period sizes.
Post by Christopher Obbard
I've not had much luck with PREEMPT_RT in the past, I don't
think I've set the priorities of interrupts properly.
Then start with Rui's rtirq script. Some of the devices in the default
script are x86 specific, and even some of those are probably outdated, but
the framework is great, put the devices you want in the configuration file
and run the script, it will set everything for you.
Post by Christopher Obbard
I think I need to chrt the soundcard IRQ highest, then JACK, then my
application....
Post by Christopher Obbard
AFAIK, jack already does this with the -R argument & sets the
scheduler priority to SCHED_FIFO.
I don't think the application RT priority matters, if I understand
correctly jackd will be allocating the thread that the application uses
for audio callback, so that thread of the application should run at jackd
priority, so for a simple embedded system you would just need to set the
sound hardware IRQ the highest, then jackd after that (with the -R
argument, default values is probably OK), then everything else can run at
non-RT default. I think non-RT default is 50 and jackd default is 70
based on what I have seen on x86 machines.
Post by Christopher Obbard
Have you got any links to information, books etc on RT patch?
Just this, but honestly the original jackd developers took care of almost
everything for you, you shouldn't need to know much about the low level
details just to get it running.
https://wiki.linuxfoundation.org/realtime/documentation/start
--
Chris Caudle
Christopher Obbard
2018-03-26 22:12:23 UTC
Hi Chris,
Post by Chris Caudle
Post by Christopher Obbard
Big typo! I am running on a TI 1GHz AM3358 processor.
Ok, three orders of magnitude is a pretty significant difference. So is
this a BeagleBone Black or something similar?
Yeah, BeagleBone Black. It's quite a nice board!
Post by Chris Caudle
If so I run the RT kernels from the RCN repo, never had any problem with
those so might be a good starting point.
Ah, I've been running mainline stable 4.14, applied CONFIG_PREEMPT and
ran a cyclictest on three threads with no extra load. It was giving me
"ok" average latency, and terrible maximum latency. It happened to be
every few seconds there would be a massive spike. No wonder I've been
stumped !!!

Using the 4.14-rt branch from https://github.com/beagleboard/linux/, the
same cyclictest without load was giving me a maximum of 46 µs. So this is
looking promising!
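(For the record the run was roughly this shape; I'm writing the flags from memory, so treat it as a sketch rather than the exact command:)

$ sudo cyclictest -t3 -p90 -m -n -i1000 -l100000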

Started jackd, all looking OK.
Can get jack periods down to 64 now, even 32! But some dreaded errors occur.
This time I have a printk message of "GBLCTL write error" every few us.
Good news! Looks like a bug in the mcasp driver.
The message didn't appear on mainline; it's getting late here, so time to
squish it tomorrow.
I tried my client with jackd -ddummy; I know it's not a real test,
but it managed to run down to 32 frames.
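(That test was along these lines; I believe the dummy backend takes the same -r/-p options, but I haven't double-checked every flag:)

$ jackd -R -P95 -ddummy -r48000 -p32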

Lesson learnt: use vendor BSP (unless the vendor is Chinese; that has
bitten me before ;-) )
Post by Chris Caudle
Post by Christopher Obbard
I've been using CONFIG_PREEMPT also in other places, I have been
wondering whether the full RT patch will cause less throughput for
the jack process.
The design of jackd is not about throughput, it is about low latency. If
you want better throughput, use larger period sizes. That is not really
specific to jack, that is just a general principle in computing, if you
want the best throughput using a general purpose computer design you
should bundle your data into larger groups so that the processor has to
handle fewer interrupts and the DMA controller(s) can run in the most
efficient configuration. If you want the lowest latency you will have to
trade off some throughput for that.
If you want really low latency and high throughput you will likely have to
design custom hardware with pipelined processing (e.g. some of the
cut-through switch designs used for Infiniband and very low latency HPC
Ethernet switches). That is a more advanced topic than how to get jackd
running on an off the shelf processor.
Thanks for your detailed explanation! It really helps; I've been
diving deep into the kernel for the past year or so.
I've been a software/electronics engineer at a higher level for a few
years, but this is just super interesting to me.
Post by Chris Caudle
Post by Christopher Obbard
Both jack and my application are requesting SCHED_FIFO, I am not sure
of the priority of the application but I am thinking of setting them
both to 70-80.
Why not start off reading the linux audio and jack FAQs that explain how
you should configure your RT system? It sounds like you are getting ready
to re-discover all of that information the long and hard way. Might be
entertaining and educational but probably is not the best use of your time.
Yeah, the system is all set up for audio to have realtime privileges. I've
managed to set up similar things on a number of different AMD64 systems
before, but not so much on ARM embedded boards.
Post by Chris Caudle
I thought there was a page somewhere with suggestions on RT priority
settings, but I do not see it on the jackaudio FAQ or wiki pages. The
http://www.rncbc.org/jack/
(link is toward the bottom)
Yeah, the rtirq script is handy. It will need some tweaking for arm
boards, I think.
Post by Chris Caudle
https://wiki.linuxaudio.org/wiki/system_configuration
but unfortunately the low latency page is from 2000, there have been quite
a few changes in kernel setup since then, so I don't know how useful the
low latency wiki page actually is on linuxaudio.org these days.
This is somewhere I want to improve the docs once I've figured it out a bit better :-).
Post by Chris Caudle
The short version is you want the interrupt for your sound hardware to be
highest, then jack right after that. There probably won't be any other RT
tasks on a single use system, so it probably won't matter whether you set
the sound IRQ and jackd priorities to 90 and 89, or 70 and 69, just make
sure they are higher than the default for non-RT tasks, which I think is
50.
The applications using jackd actually run the RT audio thread in jackd
context, so unless I misunderstood something it should not in principle
matter what the priority of the (main thread) of the other applications is
set to.
Thanks, that kind of matches what I was thinking.
Post by Chris Caudle
Post by Christopher Obbard
What about HZ, I currently have this set to HZ_100. Would HZ_1000 be any
better?
Could be better for scheduling latency, but for IRQ driven tasks like
handling the sound hardware I don't know if it matters or not.
The TI defconfig for RT sets 250 Hz, and it seems to do OK.
Post by Chris Caudle
Post by Christopher Obbard
Post by Chris Caudle
What audio hardware are you running?
Are you using an ALSA driver from the current kernel tree?
It's a 6-channel in, 6-channel out card with a simple ALSA driver
written by me.
There appears to be a TI McASP driver in the kernel tree already, any
reason you are not using that? Just asking, I have not tried to base
anything on that driver yet myself.
http://processors.wiki.ti.com/index.php/Sitara_Linux_Audio_Driver_Overview
From that page it appears you would just need to write the codec driver
that handles specifics of the platform driver settings needed, and any
control interfaces.
Post by Christopher Obbard
The driver just binds the codecs to the CPU, nothing latent in there.
All of the FIFO stuff is taken care of by the TI McASP driver.
OK, it sounds like maybe that is actually what you did, just wrote the
codec portion and used the existing McASP driver to handle the I2S/TDM
interface pieces.
Exactly that, nothing super clever :-).
Post by Chris Caudle
Post by Christopher Obbard
Currently the driver reads & writes to 8 channels but two of the
out channels are unused, I can possibly try to gain some
performance here.
Not worth the trouble at this point. If you get a system that is mostly
working and just has an occasional problem it might be worth optimizing,
but it sounds like right now the system is pretty broken, so look for big
problems, not small optimizations you could make.
This is the kind of experienced knowledge that is really helpful to me!
Post by Chris Caudle
Post by Christopher Obbard
Sorry, I missed out the first part of the log. The LockedTimedWait
error came when my client tried to connect. I think it's to do
with the Kernel scheduling.
Make sure you have RT scheduling enabled for whatever user account you are
using, try again and get the entire startup log.
Post by Christopher Obbard
It manages to open and all performs fine at 128 frames.
OK, then make sure you are really clear about when you are trying to
debug problems with e.g. 32 frame setting vs. problems with the 128 frames
setting, since it looks like you have different problems with the
different period sizes.
Post by Christopher Obbard
I've not had much luck with PREEMPT_RT in the past, I don't
think I've set the priorities of interrupts properly.
Then start with Rui's rtirq script. Some of the devices in the default
script are x86 specific, and even some of those are probably outdated, but
the framework is great, put the devices you want in the configuration file
and run the script, it will set everything for you.
Post by Christopher Obbard
I think I need to chrt the soundcard IRQ highest, then JACK, then my
application....
Post by Christopher Obbard
AFAIK, jack already does this with the -R argument & sets the
scheduler priority to SCHED_FIFO.
I don't think the application RT priority matters, if I understand
correctly jackd will be allocating the thread that the application uses
for audio callback, so that thread of the application should run at jackd
priority, so for a simple embedded system you would just need to set the
sound hardware IRQ the highest, then jackd after that (with the -R
argument, the default value is probably OK), then everything else can run at
non-RT default. I think non-RT default is 50 and jackd default is 70
based on what I have seen on x86 machines.
That makes sense!
Post by Chris Caudle
Post by Christopher Obbard
Have you got any links to information, books etc on RT patch?
Just this, but honestly the original jackd developers took care of almost
everything for you, you shouldn't need to know much about the low level
details just to get it running.
https://wiki.linuxfoundation.org/realtime/documentation/start
Hopefully, the bug I have uncovered with the GBLCTL was making me
lose the plot a little bit :-)


Really appreciate your in-depth replies.


Cheers!

Chris
Chris Caudle
2018-03-26 23:08:54 UTC
Post by Christopher Obbard
Ah, I've been running mainline stable 4.14, applied CONFIG_PREEMPT and
ran a cyclictest on three threads with no extra load. It was giving me
"ok" average latency, and terrible maximum latency.
That is exactly the point of the RT patches. "Most" of the time latency
is OK with the standard configuration options. If you don't mind losing
some data occasionally the standard kernel config options are probably OK.
Post by Christopher Obbard
This time I have a printk message of "GBLCTL write error" every few us.
Every few microseconds? I took a quick look at the McASP documentation
and that register looks like a control register you would setup once when
starting the driver, and then not touch it again until you had to
reconfigure. If it is being written more than a couple of times something
is wrong. It is possible that something is triggering error handling to
kick in, it looks like that register has to be written again in certain
cases where clock synch is lost, or possibly lost. Without knowing more
details of how you configured clocking and what your hardware looks like I
can't say much more than that could be something to investigate.
Post by Christopher Obbard
Lesson learnt: use vendor BSP
Or Robert Nelson's (RCN); his kernel builds are probably at least as good
as TI's and sometimes have additional features included.
Post by Christopher Obbard
Yeah, the rtirq script is handy. It will need some tweaking for arm
boards, I think.
Should only need a change to which devices are included in the
RTIRQ_NAME_LIST string in /etc/sysconfig/rtirq (or maybe /etc/rtirq, I
don't remember where it would go in Debian).
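Something like this should be enough; the name only has to be a substring of whatever your card's IRQ is called in /proc/interrupts, and I'm quoting the variable names from memory:

$ grep -i mcasp /proc/interrupts    # confirm the name the driver registered

# then in the rtirq config file:
RTIRQ_NAME_LIST="mcasp usb"
RTIRQ_PRIO_HIGH=90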


best regards,
Chris Caudle
Christopher Obbard
2018-03-27 23:18:50 UTC
Hi Chris,

Success tonight!

Jack running nicely with -r48000 -p64 -n2 -s
The softmode flag I will talk about later...
Post by Chris Caudle
Post by Christopher Obbard
Ah, I've been running mainline stable 4.14, applied CONFIG_PREEMPT and
ran a cyclictest on three threads with no extra load. It was giving me
"ok" average latency, and terrible maximum latency.
That is exactly the point of the RT patches. "Most" of the time latency
is OK with the standard configuration options. If you don't mind losing
some data occasionally the standard kernel config options are probably OK.
I was just pointing out the latency difference between mainline and the TI
kernel with the RT patch applied.
Post by Chris Caudle
Post by Christopher Obbard
This time I have a printk message of "GBLCTL write error" every few us.
Every few microseconds? I took a quick look at the McASP documentation
and that register looks like a control register you would setup once when
starting the driver, and then not touch it again until you had to
reconfigure. If it is being written more than a couple of times something
is wrong. It is possible that something is triggering error handling to
kick in, it looks like that register has to be written again in certain
cases where clock synch is lost, or possibly lost. Without knowing more
details of how you configured clocking and what your hardware looks like I
can't say much more than that could be something to investigate.
More digging (I just put some printks into the kernel) shows that
this repeated GBLCTL error is actually caused by jack xruns
automatically trying to reset the soundcard.

This repeats over & over unless jack's softmode is enabled. It only
happens with really tiny period values; with 256 and above it tries to
restart the card once or twice.

[ 748.951772] AUDIOTEST davinci_mcasp_start stream: 0
[ 748.956704] AUDIOTEST mcasp_start_tx
[ 748.960317] AUDIOTEST davinci_mcasp_start stream: 1
[ 748.965219] AUDIOTEST mcasp_start_rx
[ 748.969932] AUDIOTEST GBLCTL write error reg: 96 val: 1
[ 748.976214] AUDIOTEST davinci_mcasp_stop stream: 0
[ 748.976420] davinci-mcasp 48038000.mcasp: Transmit buffer underflow
[ 748.987335] AUDIOTEST mcasp_stop_tx
[ 748.990847] AUDIOTEST davinci_mcasp_stop stream: 1
[ 748.995659] AUDIOTEST mcasp_stop_rx
[ 748.999212] davinci-mcasp 48038000.mcasp: unhandled rx event. rxstat: 0x00000104


I think the problem is due to the following call sequence happening
when Jack is started:

my_driver_hw_params()
    sets cpu output format
    sets cpu sysclk
    sets cpu tdm slot
    sets codec sysclk & pll path
davinci_mcasp_start()
    calls mcasp_start_tx() -- all OK
    calls mcasp_start_rx()
        calls mcasp_set_ctl_reg(DAVINCI_MCASP_GBLCTLR_REG, RXHCLKRST)
        -- craps out setting the receive clock. This is where the GBLCTL
           error message is coming from.

This then causes an xrun in jack, calling davinci_mcasp_stop() and
then repeating all over again.

Jack with the soft flag doesn't restart the ALSA driver, so all is OK
in the world there. Until there is a genuine xrun!

I've verified the bitstream looks OK on a logic analyser.

Isn't software fun? :-)


Cheers!

Chris
