Discussion:
[Cerowrt-devel] notes on going for a stable release
Dave Taht
2014-01-14 06:07:50 UTC
I am in strong agreement that cerowrt is close to being ready for a
stable release.

However there are problems...

* Gating factors
** Sync with openwrt's release schedule

I have not been tracking what openwrt's plan is for releasing a stable
version of "Barrier Breaker". Attitude Adjustment (AA) got stalled on
finding a stable maintainer, and took a really long time to become
stable after that.

** Native IPv6 and dhcpv6-pd support
I am very happy with comcast's huge rollout of ipv6 across their
network. (over 25% of their base now) http://www.comcast6.net .

Hearing that cero isn't working on comcast anymore bugs me. I don't
seem to have ipv6 on my directly controlled comcast nodes. Yet.

Setting up a dhcpv6-pd server and some testing is required to make
sure it isn't cerowrt that's busted. I'd like to not go another year
without ipv6.

** Instruction traps

The instruction trap problem has resurfaced on boot. It is unknown
what triggers it. It doesn't happen very much after boot in my limited
testing.

The last time it bit me was on doing tests on a busy ipv6-enabled
network where it thoroughly blew up the tests. (even when not doing
ipv6 itself) It also made cerowrt unreliable.

***@davedesk:~# cd /sys/kernel/debug/mips/
***@davedesk:/sys/kernel/debug/mips# cat unaligned_instructions
7884

What values do you see, both on boot and after some uptime?

For more details on how to actually fix the bug:
http://www.bufferbloat.net/issues/419
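
For anyone reporting numbers, here is a minimal sketch of a script that prints the trap counter next to the uptime and a rough traps-per-hour rate, so boot-time and long-uptime readings can be compared. The counter path is the one shown above; the helper name traps_per_hour and the use of /proc/uptime are assumptions made for illustration, not something cerowrt ships.

```shell
#!/bin/sh
# Sketch: report the MIPS unaligned-instruction trap counter with uptime.
# traps_per_hour is a hypothetical helper, not part of cerowrt.

# traps_per_hour COUNT UPTIME_SECONDS -> integer traps per hour (0 if uptime is 0)
# Note: COUNT * 3600 may overflow if the shell only has 32-bit arithmetic.
traps_per_hour() {
    if [ "$2" -gt 0 ]; then
        echo $(( $1 * 3600 / $2 ))
    else
        echo 0
    fi
}

COUNTER=/sys/kernel/debug/mips/unaligned_instructions

# Only report when the debugfs counter is actually readable on this machine.
if [ -r "$COUNTER" ]; then
    count=$(cat "$COUNTER")
    up=$(cut -d. -f1 /proc/uptime)   # whole seconds since boot
    echo "traps=$count uptime_s=$up per_hour=$(traps_per_hour "$count" "$up")"
fi
```

Running it once right after boot and again after a few days of uptime gives two lines that can be pasted into a reply.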
** IPv6 vs THC

Go blow up ipv6 some more, please, and look at instruction traps...

and run stuff from this

https://www.thc.org/download.php?t=r&f=thc-ipv6-2.5.tar.gz

** src/dst routing via babels

In the last (3.10.24) dev release I switched to babels from quagga.
Either nobody but me uses babel (?), or it "just worked". That said,
the whole point of doing that was to be able to test multiple exit
nodes with tcp and mptcp (ipv4 and ipv6) and packet encapsulations
(6rd, native, 6in4)... and I haven't got round2it.

** Random number testing

I forget when I slammed in ralf's improved mips random number stuff,
but it's in cero, and it should be verified to be working properly
before being pushed to openwrt or shipped. Ted's other random stuff
was already picked up by openwrt.

** Refresh of some test packages

shaperprobe and uftp4 need an update in particular. I haven't been
tracking what else needs an update.

* DEFER to next release cycle thoughts
** IWL crash

The iwl (http://www.iwl.com) test suite caused a sporadic reboot of
cerowrt back in November. As I haven't been able to reproduce it, I
can let this pass. (I did witness the semi-repeatable crash with my
own two eyes.)

** make-wifi-fast
There is a huge amount of work that can now be done to improve wifi
performance. This will be the year of make-wifi-fast. I would very much
like to declare a 3.10.X cerowrt stable and go off and work in x86
and arm land for a while to hack at 802.11ac on the ath10k and
802.11n on the ath9k.
Perhaps "make-wifi-fast" would be something that sells to funders
better than "fixing bufferbloat" or "fixing security problems".

** dnsmasq + dnssec support
dnsmasq 2.69test3 has sort-of-working dnssec support, and I'd like
to start testing that soon.

* Hot topics

** What is a stable release?

To me a "stable release" is something that has been extensively
tested, benchmarked, and will have a series of updates and security
fixes for 1-2 years. It has a maintainer, a bug database, a means for
dealing with major security issues, and so on.

** What is CeroWrt?

Originally intended to prove out a bunch of AQM and scheduling ideas,
it's done that. We proved dnssec was feasible, and Simon Kelley is
doing that. ISC and openwrt got signed updates working recently; the
only major update-in-the-field problem for openwrt is on updating
kernels.

CeroWrt is ALSO useful for day-to-day use, presently.

** CeroWrt could use a non-profit foundation
Although we have achieved stable hosting with ISC for the next
year, we've been unable to find a permanent "home" for the project, or
funding for it or bufferbloat.net.

** CeroWrt needs a new maintainer
After 2 years of doing this, my interest in solving the "cross
compilation problem of the week" has declined considerably. I burn
out frequently, and only recover when I'm driven by the contributions
of you, the cerowrt-devel folk. I was delighted by the increase in
interest and in the new stuff that arrived from everybody on the sqm
front over the last couple months - but my own motivation was in
seeing sqm-scripts and the gui pushed up to openwrt, not so much on
making Cero usable by your mom.

Right now I try to dedicate only my Sundays and early Monday mornings
(many openwrt contributions land on Sundays in the wee hours, German
time) to it. In the early days it was 4-5 days a week, in addition
to running bufferbloat.net. Obviously progress has slowed as I have
slowed. It would get worse if I also had to maintain a stable release.

I strongly believe in the CI cycle that cerowrt does, at least once a
week, integrating changes from upstream and at the very least, compile
testing. This nips small problems in the bud before they become big
ones.

My interest in maintaining a "stable" release as well as continuing
development is slim.
I would like my sundays back. I would like to be able to work on 5
rfcs, some new code replacing htb, better analysis and qa tools, and a
couple papers... and also my day job is very different from
maintaining cero, and that is what is putting food on the table now.

Without stable funding, a non-burnt-out maintainer, and a non-profit
setup to manage the org, I don't know what the future could hold for
cero as an independent OS. Certainly I feel the pains of you, bruce,
the ISPs and vendors for the costs of continual maintenance and
security vigilance:

https://plus.google.com/u/0/+JimGettys/posts/SprUcpmDa1W

But who should pay to keep the internet edge working right? I don't
think it can be done by volunteers. What would happen if the NYT
covered cero and there were 10,000 new users all at once?

At the current ~$140/mo in donations, cerowrt and bufferbloat.net are
not viable entities. I've been working with a new nonprofit that
daydreams of using kickstarter campaigns to get stuff done, but I
think the appeal of running a kickstarter to pay for a maintainer for
a year is limited.

Debian had a LOT more users than cero ever will before it got to where
it had a nonprofit group running it.

By all means, we should do a stable release soon, but what happens after?

** The wndr3800 is obsolete and fixing the next generation soon would be good

The chipset it is based on keeps going strong, and so far we've been
able to find versions of it on amazon pretty regularly.

But this past christmas everything was 802.11ac, running on arm. The
ath10k is out and getting some love, too.

** Finishing the new SQM stuff
*** emulate bfifo DSLAM, CMTS
I have code for this; I just have never pushed it out.
*** emulate busted wondershaper
I was annoyed enough at wondershaper to write a newer version of it.
*** Push to openwrt of SQM and other patches

In order to replace the aqm-scripts, it needs to do packet classification better.

* Fit and finish issues that would be good to have done before a stable release

** BCP38 compliance
Cerowrt does not currently stop unknown rfc1918 addresses from going out ge00.
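
One possible shape for this, as a sketch only: generate iptables rules that reject forwarded packets headed out the WAN interface toward RFC 1918 destinations. The helper name bcp38_rules is hypothetical, and the rules are printed rather than applied so they can be reviewed first. Reading "rfc1918 addresses going out ge00" as a destination match is an assumption, and an exception would be needed wherever the upstream network itself uses RFC 1918 space (a modem's management address, for example).

```shell
#!/bin/sh
# Sketch: print (not apply) iptables rules that keep RFC 1918 destinations
# from being forwarded out the WAN interface.
# bcp38_rules is a hypothetical helper, not part of cerowrt.

# bcp38_rules WAN_INTERFACE -> one iptables command per RFC 1918 block
bcp38_rules() {
    wan=$1
    for net in 10.0.0.0/8 172.16.0.0/12 192.168.0.0/16; do
        echo "iptables -I FORWARD -o $wan -d $net -j REJECT --reject-with icmp-net-unreachable"
    done
}

# ge00 is the WAN interface name used in the note above.
bcp38_rules ge00
```

REJECT is used instead of DROP so that local clients get an immediate error rather than a timeout; piping the output to sh as root would apply the rules.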
** Squash incoming diffserv bits
Many providers pee on the diffserv bits. It would be good to detect it
and reset incoming packets to BE. (Note: IPv6 is far less peed on.)
There was a nice idea discussed last year on using conntrack to match
incoming with outgoing diffserv bits.
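
As a rough sketch of the simplest variant, the DSCP target in the mangle table can reset the field on everything arriving on the WAN interface. This squashes to best effort unconditionally: it does not detect whether the provider is remarking, and it does not implement the conntrack matching idea. The helper name squash_dscp_rules is hypothetical, and the rules are printed for review rather than applied.

```shell
#!/bin/sh
# Sketch: print (not apply) rules that reset DSCP to 0 (best effort) on all
# packets arriving on the WAN interface, for both IPv4 and IPv6.
# squash_dscp_rules is a hypothetical helper, not part of cerowrt.

# squash_dscp_rules WAN_INTERFACE -> one command each for iptables and ip6tables
squash_dscp_rules() {
    wan=$1
    for cmd in iptables ip6tables; do
        echo "$cmd -t mangle -A PREROUTING -i $wan -j DSCP --set-dscp 0"
    done
}

# ge00 is the WAN interface name used elsewhere in these notes.
squash_dscp_rules ge00
```

Given the note that IPv6 is far less affected, the ip6tables line could reasonably be left out.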

** SSL support for the configuration interfaces
All the plumbing exists for this in cero; it just has to be made to
work. The key generation routine needs to be fixed in uci-defaults and the
lighttpd config updated. It's embarrassing to not have SSL running.

** On board documentation updates
The on-board documentation still has a lot of obsolete information in it.
I just updated the credits file.

* Bufferbloat.net problems
The bufferbloat.net servers are undermaintained and obsolete. I long
ago swapped out my sysadmin and ruby skills for other things.

** huchra replacement (one disk currently crashed, the other going)
In addition to running this mailing list this used to be 1/5th of the
openwrt build cluster.

lists needs to move to a virtual server ASAP.

openwrt could really use a good build cluster. been running most of
theirs now for a couple years, out of machines pulled from the junk
bin.

** Web Site updates
The redmine implementation on bufferbloat.net has been overrun by
spam, and long ago I stopped accepting new contributors who didn't
also contact me via email.

Given how hard it would be to update the present website, perhaps
moving to cerowrt.org on a virtual server will be simpler.
--
Dave Täht

Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html
Christopher Robin
2014-01-14 08:37:59 UTC
Thanks for these notes. As a user who's been frustrated in trying to
understand the state of CeroWrt and find a way to contribute, I find
this very helpful. I'm not sure what to make of the following though.
Post by Dave Taht
** What is CeroWrt?
Originally intended to prove out a bunch of AQM and scheduling ideas,
it's done that. We proved dnssec was feasible, and Simon Kelley is
doing that. ISC and openwrt got signed updates working recently; the
only major update-in-the-field problem for openwrt is on updating
kernels.
CeroWrt is ALSO useful for day-to-day use, presently.
If CeroWrt has fulfilled its original intentions, where does that
leave us now? What improvements is CeroWrt currently working on that
OpenWrt lacks? What's the end game?

I haven't been here long, but it seems to me that CeroWrt should avoid
being a distribution and instead stick to being a proof-of-concept
project. "Going stable" shouldn't mean having a release with bug fixes
that's ready for a production environment, it should mean having the
code tested to a point where it can be pushed upstream to OpenWrt to
implement into their releases. It should be about setting a new "close
enough" baseline to get testers/users to help stress test the new
code.

But I'm new here, and I don't fully understand the workflows and
ideologies involved. Maybe having a stable release is required to push
CeroWrt improvements upstream. Or maybe that's not what you guys are
aiming for.

Some questions that may help provide a better scope for the project:

How many users have CeroWrt running in a production environment (as
the primary router in a business)?
How many users have CeroWrt running as a primary or only router at home?
Is it a goal of this group to provide a CeroWrt build for businesses
to run as their /only/ edge router and expect 24/7 uptime?
Is it a goal of this group to provide a CeroWrt build easy enough for
the average end user (grandma) to run on their only router?
....

Hrm, I'm rereading all the above and having difficulty liking it for
some reason so let me sum up.

***Are we here for research and development, or are we here for final
implementation?

If we're here for R&D then our "stable" build should be what most
distributions would consider as a beta. Something like we're 99%
certain it won't brick your router and 80-95% certain it won't be
unusable.

If we're trying to be a distribution for end users, we should really
look at expanding the number of routers we support.
Toke Høiland-Jørgensen
2014-01-14 09:44:07 UTC
Post by Christopher Robin
***Are we here for research and development, or are we here for final
implementation?
I've always thought about CeroWRT as an R&D project. As Dave points out
I don't think it's realistic to provide a "stable" release in the sense
of having it upgraded and maintained. At least not as things stand now.
However, designating a release as "stable" in the same way as the
previous one (i.e. something that won't crash and where most or all of
the advertised features (mostly) work) would probably be a good idea.
In particular, crash bugs and things that are completely broken should
probably be fixed?


As far as my installation goes:

# cat /sys/kernel/debug/mips/unaligned_instructions
154737
# uptime
10:39:18 up 5 days, 10:56, load average: 0.05, 0.03, 0.04
# dmesg | grep "TX DMA"
[348064.371093] ath: phy0: Failed to stop TX DMA, queues=0x004!
# dmesg | grep "checksum failed"
[13551.957031] ICMPv6 checksum failed [2001:xxxx:xxxx:xxxx:0000:0000:0000:0001 > 2001:xxxx:xxxx:xxxx:0000:0000:0000:0002]
[16072.535156] ICMPv6 checksum failed [2001:xxxx:xxxx:xxxx:0000:0000:0000:0001 > 2001:xxxx:xxxx:xxxx:0000:0000:0000:0002]
[22734.054687] ICMPv6 checksum failed [2001:xxxx:xxxx:xxxx:0000:0000:0000:0001 > 2001:xxxx:xxxx:xxxx:0000:0000:0000:0002]
[93252.820312] ICMPv6 checksum failed [2001:xxxx:xxxx:xxxx:0000:0000:0000:0001 > 2001:xxxx:xxxx:xxxx:0000:0000:0000:0002]
[96253.570312] ICMPv6 checksum failed [2001:xxxx:xxxx:xxxx:0000:0000:0000:0001 > 2001:xxxx:xxxx:xxxx:0000:0000:0000:0002]
[106396.003906] ICMPv6 checksum failed [2001:xxxx:xxxx:xxxx:0000:0000:0000:0001 > 2001:xxxx:xxxx:xxxx:0000:0000:0000:0002]
[156808.253906] ICMPv6 checksum failed [2001:xxxx:xxxx:xxxx:0000:0000:0000:0001 > 2001:xxxx:xxxx:xxxx:0000:0000:0000:0002]
[163650.000000] ICMPv6 checksum failed [2001:xxxx:xxxx:xxxx:0000:0000:0000:0001 > 2001:xxxx:xxxx:xxxx:0000:0000:0000:0002]
[224205.101562] ICMPv6 checksum failed [2001:xxxx:xxxx:xxxx:0000:0000:0000:0001 > 2001:xxxx:xxxx:xxxx:0000:0000:0000:0002]
[269216.191406] ICMPv6 checksum failed [2001:xxxx:xxxx:xxxx:0000:0000:0000:0001 > 2001:xxxx:xxxx:xxxx:0000:0000:0000:0002]
[276718.035156] ICMPv6 checksum failed [2001:xxxx:xxxx:xxxx:0000:0000:0000:0001 > 2001:xxxx:xxxx:xxxx:0000:0000:0000:0002]
[316807.695312] ICMPv6 checksum failed [2001:xxxx:xxxx:xxxx:0000:0000:0000:0001 > 2001:xxxx:xxxx:xxxx:0000:0000:0000:0002]
[329890.929687] ICMPv6 checksum failed [2001:xxxx:xxxx:xxxx:0000:0000:0000:0001 > 2001:xxxx:xxxx:xxxx:0000:0000:0000:0002]
[333792.148437] ICMPv6 checksum failed [2001:xxxx:xxxx:xxxx:0000:0000:0000:0001 > 2001:xxxx:xxxx:xxxx:0000:0000:0000:0002]
[399208.269531] ICMPv6 checksum failed [2001:xxxx:xxxx:xxxx:0000:0000:0000:0001 > 2001:xxxx:xxxx:xxxx:0000:0000:0000:0002]
[410070.828125] ICMPv6 checksum failed [2001:xxxx:xxxx:xxxx:0000:0000:0000:0001 > 2001:xxxx:xxxx:xxxx:0000:0000:0000:0002]
[435757.078125] ICMPv6 checksum failed [2001:xxxx:xxxx:xxxx:0000:0000:0000:0001 > 2001:xxxx:xxxx:xxxx:0000:0000:0000:0002]
[441458.539062] ICMPv6 checksum failed [2001:xxxx:xxxx:xxxx:0000:0000:0000:0001 > 2001:xxxx:xxxx:xxxx:0000:0000:0000:0002]
[449560.417968] ICMPv6 checksum failed [2001:xxxx:xxxx:xxxx:0000:0000:0000:0001 > 2001:xxxx:xxxx:xxxx:0000:0000:0000:0002]


I've had to re-initialise the wifi a couple of times for no apparent
reason, and one or two reboots necessary, but nothing that major...
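
To make reports like the one above easier to compare, here is a small sketch that counts the ICMPv6 checksum failures in dmesg-style input and prints the mean gap between them. The function name checksum_failure_summary is hypothetical, and it assumes the bracketed seconds-since-boot timestamp format shown in the log lines above.

```shell
#!/bin/sh
# Sketch: summarise "ICMPv6 checksum failed" lines from dmesg-style input.
# checksum_failure_summary is a hypothetical helper, not part of cerowrt.

# Reads log lines on stdin; prints the event count and, when there are at
# least two events, the mean number of seconds between consecutive events.
checksum_failure_summary() {
    awk '
    /ICMPv6 checksum failed/ && match($0, /^\[ *[0-9]+\.[0-9]+\]/) {
        # Strip the surrounding brackets and convert the timestamp to a number.
        ts = substr($0, RSTART + 1, RLENGTH - 2) + 0
        if (n == 0) first = ts
        last = ts
        n++
    }
    END {
        if (n > 1)
            printf "events=%d mean_gap_s=%.0f\n", n, (last - first) / (n - 1)
        else
            printf "events=%d\n", n
    }'
}

# Typical use on the router:
#   dmesg | checksum_failure_summary
```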

-Toke
David Personette
2014-01-14 12:51:24 UTC
I agree with Toke on this: cerowrt with a single supported router was never
about mass adoption. I think everyone using it is in the self-selected
group of people that knew enough about networking to find why their
internet connection was *breaking* for interactive use, and then went out and
bought a router that cost 2x-3x what other similar-specification consumer units
cost. As far as I recall, initial installation required TFTP. Not a real
hurdle for many of us, but quite a barrier to the normal consumer. I've
been using it for my primary router for over a year now, and have been very
happy with its stability and reliability. I've had to roll back a few
builds, but no real issues otherwise. People that are here are here to be
where all the new development of consumer-level implementations of internet
protocols, and things getting fixed, is happening. My 2 cents.
--
David P.
Rich Brown
2014-01-14 13:20:12 UTC
Since I kicked off this thread, let me second what David and Toke have said.

I used the wrong word - "stable" - when I really wanted a new stake in the ground. Our first was CeroWrt 3.7.5-2 - it was great. I used it for a long time before these newer builds got even better and I was willing to risk family ire. (So far, so good with 3.10.24-8).

To continue to attract attention, I'd love to be able to post news about 3.10 on the main page of the Bufferbloat site. This would give a signal to technically savvy people that we're alive and kicking and making good things. (And many thanks for the outpouring of love and offers to help that have come in from some of the new members!)

We're still a research project. (Nobody has time for World Domination :-) A stable release with 1-2 year maintenance, etc. is *way* beyond our grasp. But I was hoping for another teaser build that addresses the worst of the problems that Dave identified.

Best,

Rich

Obligatory performance stats for 3.10.24-8. IPv4 only for the moment on my WNDR3700v2. I had to reset one of my Wifi interfaces the other day.

***@cerowrt:~# uptime
07:57:57 up 7 days, 20:04, load average: 0.00, 0.01, 0.04
***@cerowrt:~# cat /sys/kernel/debug/mips/unaligned_instructions
25561
***@cerowrt:~# dmesg | grep "TX DMA"
[114502.492187] ath: phy0: Failed to stop TX DMA, queues=0x084!
[114504.027343] ath: phy0: Failed to stop TX DMA, queues=0x006!
***@cerowrt:~# dmesg | grep "checksum failed"
***@cerowrt:~# dmesg | tail -5
[559339.007812] gw01: Trigger new scan to find an IBSS to join
[559342.328125] gw01: Trigger new scan to find an IBSS to join
[559344.812500] gw01: Trigger new scan to find an IBSS to join
[559344.847656] gw01: Creating new IBSS network, BSSID 32:96:29:8f:34:d8
[559344.855468] IPv6: ADDRCONF(NETDEV_CHANGE): gw01: link becomes ready
Dave Taht
2014-01-14 15:30:13 UTC
Post by Rich Brown
Since I kicked off this thread, let me second what David and Toke have said.
I used the wrong word - "stable" - when I really wanted a new stake in the
ground. Our first was CeroWrt 3.7.5-2 - it was great. I used it for a long
time before these newer builds got even better and I was willing to risk
family ire. (So far, so good with 3.10.24-8).
"A new stake in the ground". I like it.

We need to put that new stake in the ground and then go off to improve wifi!
Post by Rich Brown
To continue to attract attention, I'd love to be able to post news about
3.10 on the main page of the Bufferbloat site. This would give a signal to
technically savvy people that we're alive and kicking and making good
things. (And many thanks for the outpouring of love and offers to help that
have come in from some of the new members!)
In general my attempts at a "stabler" release have been keyed around
IETF conferences, which this year is March 2-7 in London.

http://www.ietf.org/meeting/89/index.html

It's now mid-January. So if we aim for early February that would be good.

Most of the time prior to this we've been presenting research (cheshire and
I may do a preso on ECN), but *THIS TIME* it's time to propose new standards.

So... I have been taking a thwack at updating several existing and
writing several new rfcs.

Very rough drafts are at

https://github.com/dtaht/bufferbloat-rfcs

and

https://github.com/dtaht/twd/blob/master/rfc/middle.mkd

"TWD" (naming still in progress) is essentially rrul v2. I took off
from cero the past couple weekends and (with sean connor) got most of the
truly gnarly C bits written.

co-authors and reviewers welcomed!

Please note that I write in outlines and in bits and bursts randomly
until somehow at the end a document emerges.
Post by Rich Brown
We're still a research project. (Nobody has time for World Domination :-)
Oh, well, DOCSIS 3.0 got the pie engineering change order a few weeks back.
So some form of aqm will be on the modems starting late next year probably.
No news on fixing the CMTSes of late.

And I do find tales like this inspiring:

http://www.forbes.com/sites/markrogowsky/2014/01/14/5-reasons-nest-sold-to-google/
Post by Rich Brown
A stable release with 1-2 year maintenance, etc. is *way* beyond our grasp.
But I was hoping for another teaser build that addresses the worst of the
problem that Dave identified.
OK. There are still 140+ bugs to review.
Post by Rich Brown
Best,
Rich
Obligatory performance stats for 3.10.24-8. IPv4 only for the moment on my
WNDR3700v2. I had to reset one of my Wifi interfaces the other day.
07:57:57 up 7 days, 20:04, load average: 0.00, 0.01, 0.04
25561
[114502.492187] ath: phy0: Failed to stop TX DMA, queues=0x084!
[114504.027343] ath: phy0: Failed to stop TX DMA, queues=0x006!
[559339.007812] gw01: Trigger new scan to find an IBSS to join
[559342.328125] gw01: Trigger new scan to find an IBSS to join
[559344.812500] gw01: Trigger new scan to find an IBSS to join
[559344.847656] gw01: Creating new IBSS network, BSSID 32:96:29:8f:34:d8
[559344.855468] IPv6: ADDRCONF(NETDEV_CHANGE): gw01: link becomes ready
I agree with Toke on this, cerowrt with a single supported router was never
about mass adoption. I think everyone using it is in the self selected group
of people that knew enough about networking to find why their internet
connection was *breaking* for interactive use, then go out and buy a router
that cost 2x-3x what other similar specification consumer units cost. As far
as I recall, initial installation required TFTP. Not a real hurdle for many
of us, but quite a barrier to the normal consumer. I've been using it for my
primary router for over a year now, and have been very happy with it's
stability and reliability. I've had to roll back a few builds, but no real
issues otherwise. People that are here, are here to be where all the new
development of consumer level implementations of internet protocols and
things getting fixed is happening. My 2 cents.
--
David P.
Post by Toke Høiland-Jørgensen
Post by Christopher Robin
***Are we here for research and development, or are we here for final
implementation?
I've always thought about CeroWRT as an R&D project. As Dave points out
I don't think it's realistic to provide a "stable" release in the sense
of having it upgraded and maintained. At least not as things stand now.
However, designating a release as "stable" in the same way as the
previous one (i.e. something that won't crash and where most or all of
the advertised features (mostly) work) would probably be a good idea.
In particular, crash bugs and things that are completely broken should
probably be fixed?
# cat /sys/kernel/debug/mips/unaligned_instructions
154737
# uptime
10:39:18 up 5 days, 10:56, load average: 0.05, 0.03, 0.04
# dmesg | grep "TX DMA"
[348064.371093] ath: phy0: Failed to stop TX DMA, queues=0x004!
# dmesg | grep "checksum failed"
[13551.957031] ICMPv6 checksum failed
[2001:xxxx:xxxx:xxxx:0000:0000:0000:0001 >
2001:xxxx:xxxx:xxxx:0000:0000:0000:0002]
[16072.535156] ICMPv6 checksum failed
[2001:xxxx:xxxx:xxxx:0000:0000:0000:0001 >
2001:xxxx:xxxx:xxxx:0000:0000:0000:0002]
[22734.054687] ICMPv6 checksum failed
[2001:xxxx:xxxx:xxxx:0000:0000:0000:0001 >
2001:xxxx:xxxx:xxxx:0000:0000:0000:0002]
[93252.820312] ICMPv6 checksum failed
[2001:xxxx:xxxx:xxxx:0000:0000:0000:0001 >
2001:xxxx:xxxx:xxxx:0000:0000:0000:0002]
[96253.570312] ICMPv6 checksum failed
[2001:xxxx:xxxx:xxxx:0000:0000:0000:0001 >
2001:xxxx:xxxx:xxxx:0000:0000:0000:0002]
[106396.003906] ICMPv6 checksum failed
[2001:xxxx:xxxx:xxxx:0000:0000:0000:0001 >
2001:xxxx:xxxx:xxxx:0000:0000:0000:0002]
[156808.253906] ICMPv6 checksum failed
[2001:xxxx:xxxx:xxxx:0000:0000:0000:0001 >
2001:xxxx:xxxx:xxxx:0000:0000:0000:0002]
[163650.000000] ICMPv6 checksum failed
[2001:xxxx:xxxx:xxxx:0000:0000:0000:0001 >
2001:xxxx:xxxx:xxxx:0000:0000:0000:0002]
[224205.101562] ICMPv6 checksum failed
[2001:xxxx:xxxx:xxxx:0000:0000:0000:0001 >
2001:xxxx:xxxx:xxxx:0000:0000:0000:0002]
[269216.191406] ICMPv6 checksum failed
[2001:xxxx:xxxx:xxxx:0000:0000:0000:0001 >
2001:xxxx:xxxx:xxxx:0000:0000:0000:0002]
[276718.035156] ICMPv6 checksum failed
[2001:xxxx:xxxx:xxxx:0000:0000:0000:0001 >
2001:xxxx:xxxx:xxxx:0000:0000:0000:0002]
[316807.695312] ICMPv6 checksum failed
[2001:xxxx:xxxx:xxxx:0000:0000:0000:0001 >
2001:xxxx:xxxx:xxxx:0000:0000:0000:0002]
[329890.929687] ICMPv6 checksum failed
[2001:xxxx:xxxx:xxxx:0000:0000:0000:0001 >
2001:xxxx:xxxx:xxxx:0000:0000:0000:0002]
[333792.148437] ICMPv6 checksum failed
[2001:xxxx:xxxx:xxxx:0000:0000:0000:0001 >
2001:xxxx:xxxx:xxxx:0000:0000:0000:0002]
[399208.269531] ICMPv6 checksum failed
[2001:xxxx:xxxx:xxxx:0000:0000:0000:0001 >
2001:xxxx:xxxx:xxxx:0000:0000:0000:0002]
[410070.828125] ICMPv6 checksum failed
[2001:xxxx:xxxx:xxxx:0000:0000:0000:0001 >
2001:xxxx:xxxx:xxxx:0000:0000:0000:0002]
[435757.078125] ICMPv6 checksum failed
[2001:xxxx:xxxx:xxxx:0000:0000:0000:0001 >
2001:xxxx:xxxx:xxxx:0000:0000:0000:0002]
[441458.539062] ICMPv6 checksum failed
[2001:xxxx:xxxx:xxxx:0000:0000:0000:0001 >
2001:xxxx:xxxx:xxxx:0000:0000:0000:0002]
[449560.417968] ICMPv6 checksum failed
[2001:xxxx:xxxx:xxxx:0000:0000:0000:0001 >
2001:xxxx:xxxx:xxxx:0000:0000:0000:0002]
I've had to re-initialise the wifi a couple of times for no apparent
reason, and one or two reboots necessary, but nothing that major...
-Toke
_______________________________________________
Cerowrt-devel mailing list
https://lists.bufferbloat.net/listinfo/cerowrt-devel
--
Dave Täht

Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html
Theodore Ts'o
2014-01-16 03:20:29 UTC
Permalink
Post by Rich Brown
Since I kicked off this thread, let me second what David and Toke have said.
I used the wrong word - "stable" - when I really wanted a new stake
in the ground. Our first was CeroWrt 3.7.5-2 - it was great. I used
it for a long time before these newer builds got even better and I
was willing to risk family ire. (So far, so good with 3.10.24-8).
As I suggested earlier, what would be useful is making the distinction
between "unstable" and "testing". On an internal misc- mailing list
at my company, the topic of more secure/reliable wireless AP's came
up, and I recommended Cerowrt. The response I got back was, roughly
paraphrased, "I was looking on the mailing list, and it seems that
every couple of releases there's some catastrophic breakage; it's too
exciting for me".

Without actually suggesting a commitment of two years worth of
security bug fixes, just something of the form, "our dedicated
unstable testers have been using it for the last week, and it's
probably not likely to eat your children" (well, your children's web
sites, anyway).

- Ted
Dave Taht
2014-01-15 15:18:35 UTC
Permalink
Post by Christopher Robin
Thanks for these notes. As a user who's been frustrated in trying to
understand the state of CeroWrt and find a way to contribute, I find
this very helpful. I'm not sure what to make of the following though.
Welcome aboard! You might get a better feel for the development via
visiting the #bufferbloat channel on freenode, although of late I've
not been there. I will reform.
Post by Christopher Robin
Post by Dave Taht
** What is CeroWrt?
Originally intended to prove out a bunch of AQM and scheduling ideas,
it's done that. We proved dnssec was feasible, and simon kelly is
doing that. ISC and openwrt got signed updates working recently, the
only major update-in-the-field problem for openwrt is on updating
kernels.
CeroWrt is ALSO useful for day-to-day use, presently.
If CeroWrt has fulfilled its original intentions, where does that
leave us now?
Well, some core things are still WIP. I forgot to mention that
stuart cheshire is also working on adding this to mdnsresponder

http://tools.ietf.org/html/draft-cheshire-mdnsext-hybrid-01

which would let us either improve avahi or switch to
mdnsresponder.
Post by Christopher Robin
What improvements is CeroWrt currently working on that
OpenWrt lacks?
At the moment "CeroWrt" isn't really working on much that
openwrt lacks, there are maybe 6 patches and a package
that haven't been pushed up yet. There is heavy development
on some core packages (like dnsmasq) going on that we've
been testing...

for example I don't think that nfq_codel or efq_codel will
ever make it up to openwrt, the benefits are too marginal.
Post by Christopher Robin
What's the end game?
Don't think there is an end game. :(

"Proving that 'stuff' can be done better on home routers".
Post by Christopher Robin
I haven't been here long, but it seems to me that CeroWrt should avoid
being a distribution and instead stick to being a proof-of-concept
project.
There have always been these two conflicting pulls. I LIKE the testing
we get from those that actually use it day to day, and the number of
cool people that do so is very scary.
Post by Christopher Robin
"Going stable" shouldn't mean having a release with bug fixes
that's ready for a production environment, it should mean having the
code tested to a point where it can be pushed upstream to OpenWrt to
implement into their releases. It should be about setting a new "close
enough" baseline to get testers/users to help stress test the new
code.
Agree. So I don't want to use the word "stable", but something else.
Post by Christopher Robin
But I'm new here, and I don't fully understand the workflows and
ideologies involved. Maybe having a stable release is required to push
If "doing correct science" is an ideology, then that's mine. Out of that falls
having full access to all source code.
Post by Christopher Robin
CeroWrt improvements upstream. Or maybe that's not what you guys are
aiming for.
A month or so of testing and getting valid results is what I feel is needed
to get stuff upstream. I'm not happy with the results aaron has been getting,
and am thinking of reverting two patches in the upcoming 3.10.26-2
Post by Christopher Robin
How many users have CeroWrt running in a production environment (as
the primary router in a business)?
Don't know. More than 2.
Post by Christopher Robin
How many users have CeroWrt running as a primary or only router at home?
More than a few hundred.
Post by Christopher Robin
Is it a goal of this group to provide a CeroWrt build for businesses
to run as their /only/ edge router on and expect 24/7 uptime?
Some versions of cero have been stable enough to stay up 9+ months.
This was one of my favorite bug reports EVER:

http://www.bufferbloat.net/issues/330

certainly in announcing a "stabler" release I'd like something that
stays up under heavy load for a month or more.
Post by Christopher Robin
Is it a goal of this group to provide a CeroWrt build easy enough for
the average end user (grandma) to run on their only router?
Nope.
Post by Christopher Robin
....
Hrm, I'm rereading all the above and having difficulty liking it for
some reason so let me sum up.
***Are we here for research and development, or are we here for final
implementation?
R&D here.

I'm here mostly to prove out some new ideas in queue theory. I
do get a kick (everyone else here gets a kick, too, I think) out of battling
cisco and the other big router vendors to their knees, and getting them and
the ISPs to fix their darn equipment to work better with interactive
applications.

That process has been working out better and better over the past year.

I ALSO like having a router I can *trust* to be free of security flaws
- at least as far as I and y'all can make it.

I also fiddle with mesh networking a bit. One of these days I'll have a
congestion aware routing metric that makes sense...

Everybody else has their own reasons.
Post by Christopher Robin
If we're here for R&D then our "stable" build should be what most
distributions would consider as a beta. Something like we're 99%
certain it won't brick your router and 80-95% certain it won't be
unusable.
If we're trying to be a distribution for end users, we should really
look at expanding the number of routers we support.
I'd like to add some 802.11ac chipset, but *all the firmware*
needs to be available in source form.

The ath10k comes closest, but sadly all
of the new arm boards have got blobs for the ethernet drivers.

Currently. Many folk involved here have been campaigning
behind the scenes to get another blob-free router "out there",
and it feels like we are making progress.

maybe the new linksys product won't be a chimera, or mindspeed
will come through, or some other vendor will get a clue.
--
Dave Täht

Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html
Juergen Botz
2014-01-14 12:10:23 UTC
Permalink
Post by Dave Taht
** src/dst routing via babels
In the last (3.10.24) dev release I switched to babels from quagga.
Either nobody but me uses babel (?), or it "just worked".
I do use it, and it didn't "just work" for me... my second router
didn't pick up the default route and until reading this just now I
was still wondering why /etc/quagga disappeared. On my primary
/etc/quagga still exists because I did a "keep config" upgrade,
which obscured matters further because I was looking at the
wrong config.

What should I change on the primary? /etc/babeld.conf or
/etc/config/babeld?

:j
Dave Taht
2014-01-14 15:10:04 UTC
Permalink
Post by Juergen Botz
Post by Dave Taht
** src/dst routing via babels
In the last (3.10.24) dev release I switched to babels from quagga.
Either nobody but me uses babel (?), or it "just worked".
I do use it, and it didn't "just work" for me... my second router
didn't pick up the default route and until reading this just now I
was still wondering why /etc/quagga disappeared.
I had adopted quagga in the hope of being compatible with
homenet's OSPF direction and being able to use other protocols.

That stuff hasn't made it to quagga yet.

You can, if you prefer, install quagga and the various daemons
with

opkg update
opkg remove babeld
opkg install quagga-vtysh quagga-yourprotocolofchoice

babeld is smaller, and more bleeding edge. I keep hoping
one day to be able to feed per-flow congestion information into it
as a metric....
Post by Juergen Botz
On my primary
/etc/quagga still exists because I did a "keep config" upgrade,
which obscured matters further because I was looking at the
wrong config.
Apologies!

Try, during the development process, to never keep a config.
Post by Juergen Botz
What should I change on the primary? /etc/babeld.conf or
/etc/config/babeld?
On your primary router edit:

/etc/config/babeld

and in the first "config filter" section change the ignore option to false

We don't distribute a default route by default (on quagga either!)
because that would lead to routing loops. So distributing a default
route should only happen on gateways to the internet.
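
For anyone who would rather do that edit from the shell, here is a minimal sketch using uci. It assumes the filter in question really is the first "config filter" section (index 0) and that the option is spelled "ignore" as in the stock babeld config; check with "uci show babeld" before trusting it. The helper names are made up for illustration, and nothing is changed unless RUN=1 is set.

```shell
#!/bin/sh
# Sketch: enable a babeld "config filter" section by setting ignore=false.
# Assumption: section index 0 is the default-route filter on your box.

run() {
    # Print the command by default; execute it only when RUN=1.
    if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

enable_babeld_filter() {
    idx="${1:-0}"
    run uci set "babeld.@filter[$idx].ignore=false"
    run uci commit babeld
    run /etc/init.d/babeld restart
}

enable_babeld_filter 0
```

Run it once as-is to see what it would do, then again with RUN=1 on the gateway router only.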

The babels code lets you have multiple gateways to the
internet (over ipv6 and natted ipv4), so you can easily have
a network with multiple exit points and still have
it work with real, rather than natted, IPs.

http://tools.ietf.org/html/draft-baker-rtgwg-src-dst-routing-use-cases-00

http://tools.ietf.org/html/draft-boutier-homenet-source-specific-routing-00

Support for the src/dst stuff is limited to /etc/babeld.conf at the moment.
Post by Juergen Botz
:j
--
Dave Täht

Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html
David Personette
2014-01-14 12:36:34 UTC
Permalink
Post by Dave Taht
** Instruction traps
The instruction trap problem has resurfaced on boot. It is unknown
what triggers it. It doesn't happen very much after boot in my limited
testing.
The last time it bit me was on doing tests on a busy ipv6-enabled
network where it thoroughly blew up the tests. (even when not doing
ipv6 itself) It also made cerowrt unreliable.
7884
What values do you see, both on boot and after some uptime?
http://www.bufferbloat.net/issues/419
I updated to the 3.10.26-1 build, so not much uptime. Also, I don't have my
IPv6 hurricane tunnel active ATM.

***@outpost:~# uname -a
Linux outpost 3.10.26 #1 Sun Jan 12 14:50:55 PST 2014 mips GNU/Linux
***@outpost:~# cat /sys/kernel/debug/mips/unaligned_instructions
0
***@outpost:~# uptime
06:49:16 up 20:28, load average: 0.00, 0.03, 0.04
Post by Dave Taht
** BCP38 compliance
Cerowrt does not currently stop unknown rfc1918 addresses from going out ge00.
** Squash incoming diffserv bits
many providers pee on the diffserv bits. It would be good to detect it
and reset to BE incoming packets. (note: IPv6 is far less peed on.).
There was a nice idea discussed last year on using conntrack to match
incoming with outgoing diffserv bits.
I'd added this into my /etc/firewall.user. I'd be happy to work on adding
it into the official script if you would like. I'm a sysadmin, what
development skills I have are in scripting.
Post by Dave Taht
** SSL support for the configuration interfaces
All the plumbing exists for this in cero, it just has to be made to
work. the key generation routine needs to be fixed in uci-defaults and
lighttpd config updated. It's embarrassing to not have SSL running.
If it's scripting and web server config, I'll work on this too.
Post by Dave Taht
* Bufferbloat.net problems
the bufferbloat.net servers are undermaintained and obsolete. I long
ago swapped out my sysadmin and ruby skills for other things.
** huchra replacement (one disk currently crashed, the other going)
In addition to running this mailing list this used to be 1/5th of the
openwrt build cluster.
lists needs to move to a virtual server ASAP.
openwrt could really use a good build cluster. been running most of
theirs now for a couple years, out of machines pulled from the junk
bin.
** Web Site updates
the redmine implementation on bufferbloat.net has been overrun by
spam and I stopped
accepting new contributors that didn't contact me also via email
long ago.
given how hard it would be to update the present website, perhaps
moving to cerowrt.org
on a virtual server will be simpler.
This I can work on now, if you like, I can spin up a Digital Ocean VM that
should be able to run a mailing list with no problems. Getting Postfix
setup should be a snap, I'm not sure what else is needed for the mailing
list, but we can discuss it off the mailing list. Did you want a new name
or keep huchra for the VM? Once it's up, getting a list of needed software
from huchra, certs, and the data can be synced over, do some testing, then
the DNS A and MX records can be updated.

Hmm, just saw that Digital Ocean still doesn't have IPv6 yet. Will that be
a problem? Any other suggestions for hosting it? I've used them for several
little projects, they have a good interface and rates, IMHO. Thanks.
--
David P.
Dave Taht
2014-01-15 04:11:02 UTC
Permalink
Post by David Personette
Post by Dave Taht
** Instruction traps
The instruction trap problem has resurfaced on boot. It is unknown
what triggers it. It doesn't happen very much after boot in my limited
testing.
The last time it bit me was on doing tests on a busy ipv6-enabled
network where it thoroughly blew up the tests. (even when not doing
ipv6 itself) It also made cerowrt unreliable.
7884
What values do you see, both on boot and after some uptime?
http://www.bufferbloat.net/issues/419
I updated to the 3.10.26-1 build, so not much uptime. Also, I don't have my
IPv6 hurricane tunnel active ATM.
The bug cropped up with native ipv6 on the wire, notably on the ge00
interface.
Post by David Personette
Linux outpost 3.10.26 #1 Sun Jan 12 14:50:55 PST 2014 mips GNU/Linux
0
Since I get 7k+ errors on boot with 3.10.24-8, I think .26 is looking
like an improvement in this respect.
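
To make the boot vs. uptime comparison easier across builds, something like the following could be dropped on the router. It is only a sketch: the function name and output format are invented here, and the counter path is the one quoted earlier in the thread.

```shell
#!/bin/sh
# Sketch: report the MIPS unaligned-instruction trap counter next to the uptime.
# COUNTER_FILE can be overridden so the function can be tried on a non-MIPS box.
COUNTER_FILE="${COUNTER_FILE:-/sys/kernel/debug/mips/unaligned_instructions}"

report_traps() {
    file="${1:-$COUNTER_FILE}"
    if [ ! -r "$file" ]; then
        echo "traps=unavailable"
        return 1
    fi
    # The file holds a single decimal counter.
    traps=$(cat "$file")
    # /proc/uptime: the first field is seconds since boot.
    up=$(cut -d' ' -f1 /proc/uptime 2>/dev/null || echo unknown)
    echo "kernel=$(uname -r) uptime_s=$up traps=$traps"
}

report_traps || true
```

Running it right after boot and again after a day or so gives two lines that can be pasted straight into a bug report.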

there is btw, a fix for setting mcast_rates in 3.10.26-1. I have long thought
that we should *default* to a higher rate (9mbits for 2.4ghz and 12 for
5 ghz) as freifunk does for each ssid, for multicast. This will knock weak
signal'd devices off the network entirely, but minimize the impact of
multicast on everyone else.

" hostapd: fix mcast_rate setting

Introduced by ("netifd: add wireless configuration support and
port mac80211 to
the new framework")

Reported-by: René van Weert <***@sowifi.com>
Signed-off-by: Antonio Quartulli <***@meshcoding.com>
"

to test that, add an option mcast_rate 9000 under each wifi-iface stanza
in /etc/config/wireless.
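
The same change can be sketched with uci rather than an editor. Which wifi-iface indexes are 2.4 GHz and which are 5 GHz depends on the config, so the index lists below are placeholders; look at "uci show wireless" first. The helper names are hypothetical, and commands are only printed unless RUN=1.

```shell
#!/bin/sh
# Sketch: set mcast_rate (in kbit/s) on selected wifi-iface sections via uci.
# Assumption: the section indexes passed in are specific to your config.

run() {
    # Print the command by default; execute it only when RUN=1.
    if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

set_mcast_rate() {
    rate="$1"; shift
    for idx in "$@"; do
        run uci set "wireless.@wifi-iface[$idx].mcast_rate=$rate"
    done
}

set_mcast_rate 9000 0 1      # hypothetical 2.4 GHz sections
set_mcast_rate 12000 2 3     # hypothetical 5 GHz sections
run uci commit wireless
```

The new rate takes effect after the wifi is restarted or the router rebooted.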

You can do things like fiddle with mdns-scan on your client to see if that
works better/faster or worse.
Post by David Personette
06:49:16 up 20:28, load average: 0.00, 0.03, 0.04
I am encouraged.
Post by David Personette
Post by Dave Taht
** BCP38 compliance
Cerowrt does not currently stop unknown rfc1918 addresses from going out ge00.
** Squash incoming diffserv bits
many providers pee on the diffserv bits. It would be good to detect it
and reset to BE incoming packets. (note: IPv6 is far less peed on.).
There was a nice idea discussed last year on using conntrack to match
incoming with outgoing diffserv bits.
I'd added this into my /etc/firewall.user. I'd be happy to work on adding it
into the official script if you would like. I'm a sysadmin, what development
skills I have are in scripting.
Which part? is it possible to squash just the diffserv and not the ecn bits?

iptables and ip6tables lines appreciated.

Also additional suggestions for firewall rules appreciated.

One thing I added to simple.qos in the last go round was deprioritizing
icmp a bit post-wondershaper-rant.

# ICMP traffic - Don't impress your friends. Deoptimize to manage ping floods
# better instead

$TC filter add dev $IFACE parent 1:0 protocol ip prio 8 \
    u32 match ip protocol 1 0xff flowid 1:13

# ICMPv6 is next-header 58 and needs the ip6 selector
$TC filter add dev $IFACE parent 1:0 protocol ipv6 prio 9 \
    u32 match ip6 protocol 58 0xff flowid 1:13
Post by David Personette
Post by Dave Taht
** SSL support for the configuration interfaces
All the plumbing exists for this in cero, it just has to be made to
work. the key generation routine needs to be fixed in uci-defaults and
lighttpd config updated. It's embarrassing to not have SSL running.
If it's scripting and web server config, I'll work on this too.
GOFERIT! The cert generation is just plain wrong for lighttpd...
controlled by this
file... (which vanishes after boot)

https://github.com/dtaht/cerofiles-next/blob/cerowrt-next/files/etc/uci-defaults/make-certs.sh

and could stand to be re-run out of cron or something after sufficient
randomness has been generated to ensure a decent cert.

and lighttpd doesn't seem to like the generated cert or trying to
listen with ssl enabled on
https://cerowrt.local:81 (yes, I'd like to keep using a weird port
number for the admin
interface)

I don't mind shipping openssl rather than px5g if that's what's needed
to generate a valid
cert.
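
As a starting point if openssl does end up on the image, here is a rough sketch of generating a self-signed cert and building the single PEM file (key plus cert) that lighttpd's ssl.pemfile is pointed at. The function names, paths and the CN are placeholders, not what make-certs.sh does today.

```shell
#!/bin/sh
# Sketch: self-signed cert via openssl, then one combined PEM for lighttpd.
# Assumptions: openssl is installed; all paths and the CN are placeholders.

gen_self_signed() {
    key="$1"; crt="$2"; cn="${3:-cerowrt.local}"
    openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
        -subj "/CN=$cn" -keyout "$key" -out "$crt"
}

combine_pem() {
    key="$1"; crt="$2"; out="$3"
    # Create the output with tight permissions, since it holds the private key.
    ( umask 077; cat "$key" "$crt" > "$out" )
}

# Usage (not run automatically):
#   gen_self_signed /tmp/server.key /tmp/server.crt cerowrt.local
#   combine_pem /tmp/server.key /tmp/server.crt /etc/lighttpd/lighttpd.pem
```

Running this late (from cron, say) rather than from uci-defaults would also sidestep the low-entropy-at-first-boot worry.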
Post by David Personette
Post by Dave Taht
* Bufferbloat.net problems
the bufferbloat.net servers are undermaintained and obsolete. I long
ago swapped out my sysadmin and ruby skills for other things.
** huchra replacement (one disk currently crashed, the other going)
In addition to running this mailing list this used to be 1/5th of the
openwrt build cluster.
lists needs to move to a virtual server ASAP.
openwrt could really use a good build cluster. been running most of
theirs now for a couple years, out of machines pulled from the junk
bin.
** Web Site updates
the redmine implementation on bufferbloat.net has been overrun by
spam and I stopped
accepting new contributors that didn't contact me also via email
long ago.
given how hard it would be to update the present website, perhaps
moving to cerowrt.org
on a virtual server will be simpler.
This I can work on now, if you like, I can spin up a Digital Ocean VM that
should be able to run a mailing list with no problems. Getting Postfix setup
should be a snap, I'm not sure what else is needed for the mailing list, but
we can discuss it off the mailing list.
I can free up some time to work on this fairly soon. Let me know when
you can set aside a few hours (I am presently on EST)
Post by David Personette
Did you want a new name or keep
huchra for the VM? Once it's up, getting a list of needed software from
huchra, certs, and the data can be synced over, do some testing, then the
DNS A and MX records can be updated.
Oh, that would be nice. We haven't had anybody caring for the servers since
Richard Pitt left us...

The other big problem is updating from an ancient version of redmine +
postgres to ChiliProject + postgres.
Post by David Personette
Hmm, just saw that Digital Ocean still doesn't have IPv6 yet. Will that be a
problem? Any other suggestions for hosting it? I've used them for several
little projects, they have a good interface and rates, IMHO. Thanks.
Yep, need ipv6. I have a pair of linode instances spun up but have
never done anything with them aside from use them as targets
for rrul tests. One is in NJ, the other in england. Linode seems competent...
Post by David Personette
--
David P.
--
Dave Täht

Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html
Dave Taht
2014-01-15 14:47:15 UTC
Permalink
Post by Dave Taht
there is btw, a fix for setting mcast_rates in 3.10.26-1. I have long thought
that we should *default* to a higher rate (9mbits for 2.4ghz and 12 for
5 ghz) as freifunk does for each ssid, for multicast. This will knock weak
signal'd devices off the network entirely, but minimize the impact of
multicast on everyone else.
" hostapd: fix mcast_rate setting
Introduced by ("netifd: add wireless configuration support and
port mac80211 to
the new framework")
"
to test that, add an option mcast_rate 9000 under each wifi-iface stanza
in /etc/config/wireless.
You can do things like fiddle with mdns-scan on your client to see if that
works better/faster or worse
I've added your recommended "option mcast_rate 9000" to the 2.4 ghz and
"option mcast_rate 12000" to the 5 ghz. I'll reboot the router in a bit
(random other services aren't happy after a network restart; where random ==
I don't remember what breaks).
Post by Dave Taht
Post by David Personette
Post by Dave Taht
** BCP38 compliance
Cerowrt does not currently stop unknown rfc1918 addresses from going
out
ge00.
** Squash incoming diffserv bits
many providers pee on the diffserv bits. It would be good to detect it
and reset to BE incoming packets. (note: IPv6 is far less peed on.).
There was a nice idea discussed last year on using conntrack to match
incoming with outgoing diffserv bits.
I'd added this into my /etc/firewall.user.
I went and looked over the firewall stuff in cero and this stanza was missing.
So it isn't getting called.

# include a file with users custom iptables rules
config include
option path /etc/firewall.user
Post by Dave Taht
I'd be happy to work on
Post by David Personette
adding it
into the official script if you would like. I'm a sysadmin, what development
skills I have are in scripting.
Which part? is it possible to squash just the diffserv and not the ecn bits?
iptables and ip6tables lines appreciated.
Also additional suggestions for firewall rules appreciated.
One thing I added to simple.qos in the last go round was deprioritizing
icmp a bit post-wondershaper-rant.
# ICMP traffic - Don't impress your friends. Deoptimize to manage ping floods
# better instead
$TC filter add dev $IFACE parent 1:0 protocol ip prio 8 \
    u32 match ip protocol 1 0xff flowid 1:13
# ICMPv6 is next-header 58 and needs the ip6 selector
$TC filter add dev $IFACE parent 1:0 protocol ipv6 prio 9 \
    u32 match ip6 protocol 58 0xff flowid 1:13
Sorry, didn't trim quite enough to designate what I was replying to, I had
meant that I added the rfc1918 egress blocking to my /etc/firewall.user. But
I do have a few lines (that I thought came from this mailing list) for
cleaning up diffserv bits. Although looking at the output of 'iptables -S'
now, I'm not sure that it's even doing what the original author intended.
Instead of three dscp classes (which iptables accepts without throwing an
error), it seems to only keep one of them in the ACCEPT line... it will
probably have to be broken out into multiple lines. If it's even doing what
you were interested in, in the first place.
Please note it's also been a really long time since I touched iptables...
so everything I write below could be wrong. While the rest of you lovable
crazies are putting this stuff on your main gw I'm usually in a lab
with all the firewalls off internally, doing benchmarks....
### Configure both IPv4 and IPv6
ipt() {
    # "$@" keeps each argument intact, unlike an unquoted $*
    iptables "$@"
    ip6tables "$@"
}
# iptables doesn't support an inverted match
# ipt -t mangle -A PREROUTING -m dscp ! --dscp-class AF11,AF21,BE -j DSCP \
# --set-dscp-class BE
ipt -t mangle -N FIX_TOS
ipt -t mangle -A FIX_TOS -m dscp --dscp-class AF11,AF21,BE -j ACCEPT
I sure wish/hope that works. Otherwise writing 23 firewall rules is kind of hard
on iptables. Definitely incomplete otherwise...
ipt -t mangle -A FIX_TOS -j DSCP --set-dscp-class BE
What I see is a ton of CS1 traffic that I don't think originated that way.
I don't mind re-marking CS1 traffic to be BE on entry, and then applying
local rules (trying to match torrent for example)

I think this needs to happen only on entry, on packets arriving via the
ge00 interface. We don't want to mangle our own diffserv domain.
for i in sw00 sw10 gw00 gw10 gw01 gw11; do
ipt -t mangle -A POSTROUTING -o $i -j FIX_TOS
done
So a change to:

ipt -t mangle -A PREROUTING -i ge00 -j FIX_TOS

?
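
Putting the two points together (one class per dscp match, and hooking in on ingress from ge00), a rough and untested-on-hardware sketch might look like this. The KEEP list is an assumption about which classes should survive; everything else gets re-marked to BE. The DSCP target only rewrites the six DSCP bits, so the ECN bits are left alone. Rules are printed, not installed, unless RUN=1.

```shell
#!/bin/sh
# Sketch: squash inbound DSCP to BE except for an allow-list of classes.
# One dscp match per rule, since --dscp-class takes a single class.
KEEP="${KEEP:-AF11 AF21}"   # assumption: classes to leave untouched
WAN="${WAN:-ge00}"

run() {
    # Print the command by default; execute it only when RUN=1.
    if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

ipt() {
    run iptables "$@"
    run ip6tables "$@"
}

fix_tos_rules() {
    ipt -t mangle -N FIX_TOS
    for class in $KEEP; do
        # RETURN leaves the marking alone and resumes PREROUTING.
        ipt -t mangle -A FIX_TOS -m dscp --dscp-class "$class" -j RETURN
    done
    # Everything that was not allow-listed becomes best effort.
    ipt -t mangle -A FIX_TOS -j DSCP --set-dscp-class BE
    ipt -t mangle -A PREROUTING -i "$WAN" -j FIX_TOS
}

fix_tos_rules
```

BE is not in the KEEP list because re-marking BE to BE is a no-op anyway.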

moving onto bcp38...

http://tools.ietf.org/html/bcp38
# Create a rfc1918 IP filter
iptables -N grey
iptables -A grey -s 172.30.42.0/24 -j ACCEPT
iptables -A grey -s 10.0.0.0/8 -j DROP
iptables -A grey -s 127.0.0.0/8 -j DROP
iptables -A grey -s 172.16.0.0/12 -j DROP
iptables -A grey -s 192.168.0.0/16 -j DROP
iptables -A grey -s 169.254.0.0/16 -j DROP
iptables -I delegate_forward 4 -o ge00 -j grey
1) One problem is that bcp38 should not be on by default
in a double nat situation. Or it should at least be more clever and grok that if
its external interface has a rfc1918 address, sending to that network should be
allowed. Or punt and not enable bcp38 in a double nat situation.

2) Is there BCP38 for ipv6? I know the default rules for hurricane
"do the right thing", but native?

3) At least in my networks I use 172.x subnets heavily

4) And ipset is available. I don't know if the gui supports it or what the
difference in speed would be by traversing this many rules.
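
A sketch of how points 1 and 4 might combine: skip the filter when the WAN address is itself private, otherwise load the prefixes into one ipset. The set name, the LAN exception prefix and the function names are placeholders, and the "nomatch" exception assumes an ipset new enough to support it on hash:net. Commands are printed unless RUN=1.

```shell
#!/bin/sh
# Sketch: BCP38-style egress filter built on ipset, disabled under double NAT.
LAN_NET="${LAN_NET:-172.30.42.0/24}"   # assumption: local subnet to exempt

run() {
    # Print the command by default; execute it only when RUN=1.
    if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

is_rfc1918() {
    case "$1" in
        10.*|192.168.*) return 0 ;;
        172.1[6-9].*|172.2[0-9].*|172.3[01].*) return 0 ;;
        *) return 1 ;;
    esac
}

bcp38_rules() {
    wan_ip="$1"; wan_if="${2:-ge00}"
    if is_rfc1918 "$wan_ip"; then
        echo "# WAN address $wan_ip is private: double NAT, not enabling bcp38"
        return 0
    fi
    run ipset create bcp38 hash:net
    for net in 10.0.0.0/8 127.0.0.0/8 172.16.0.0/12 192.168.0.0/16 169.254.0.0/16; do
        run ipset add bcp38 "$net"
    done
    # Carve the local subnet back out of 172.16.0.0/12.
    run ipset add bcp38 "$LAN_NET" nomatch
    # One set lookup replaces the per-prefix rule walk.
    run iptables -I delegate_forward 4 -o "$wan_if" -m set --match-set bcp38 src -j DROP
}

# Usage: bcp38_rules <wan-ipv4-address> [wan-interface]
```

This does nothing for point 2 (IPv6), and networks that use 172.x heavily (point 3) would need more exceptions than the single LAN_NET shown here.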

Moving onto a different thought on a slightly different topic

I think something like this should be a defined rule in the
/etc/config/firewall so that missing hosts get information back...

(untested)

iptables -N noegress
iptables -A noegress -d 10.0.0.0/8 -j REJECT # with destination unreachable somehow
iptables -A noegress -d 127.0.0.0/8 -j REJECT
iptables -A noegress -d 172.16.0.0/12 -j REJECT
iptables -A noegress -d 192.168.0.0/16 -j REJECT
iptables -A noegress -d 169.254.0.0/16 -j REJECT
iptables -I delegate_forward 4 -o ge00 -j noegress #here? why this rule?
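
For the "destination unreachable somehow" part, REJECT takes a --reject-with argument. A sketch of the same chain with an explicit ICMP type follows. It is equally untested, the insert position in delegate_forward is copied from the rule above rather than verified, and the function name is invented. Rules are printed unless RUN=1.

```shell
#!/bin/sh
# Sketch: reject traffic to private destinations with an explicit ICMP error,
# so the sending host hears back instead of timing out.

run() {
    # Print the command by default; execute it only when RUN=1.
    if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "$*"; fi
}

noegress_rules() {
    wan_if="${1:-ge00}"
    run iptables -N noegress
    for net in 10.0.0.0/8 127.0.0.0/8 172.16.0.0/12 192.168.0.0/16 169.254.0.0/16; do
        run iptables -A noegress -d "$net" -j REJECT --reject-with icmp-net-unreachable
    done
    run iptables -I delegate_forward 4 -o "$wan_if" -j noegress
}

noegress_rules
```

The same double-nat caveat applies: if the upstream network is itself rfc1918, this would cut off access to it.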
Post by Dave Taht
Post by David Personette
Post by Dave Taht
** SSL support for the configuration interfaces
All the plumbing exists for this in cero, it just has to be made to
work. the key generation routine needs to be fixed in uci-defaults and
lighttpd config updated. It's embarrassing to not have SSL running.
If it's scripting and web server config, I'll work on this too.
GOFERIT! The cert generation is just plain wrong for lighttpd...
controlled by this
file... (which vanishes after boot)
https://github.com/dtaht/cerofiles-next/blob/cerowrt-next/files/etc/uci-defaults/make-certs.sh
and could use to get re-run out of cron or something after sufficient
randomness has been generated
to ensure a decent cert.
and lighttpd doesn't seem to like the generated cert or trying to
listen with ssl enabled on
https://cerowrt.local:81 (yes, I'd like to keep using a weird port
number for the admin
interface)
I don't mind shipping openssl rather than px5g if that's what's needed
to generate a valid
cert.
I'll start working on it. Thanks.
Post by Dave Taht
Post by David Personette
Post by Dave Taht
* Bufferbloat.net problems
the bufferbloat.net servers are undermaintained and obsolete. I long
ago swapped out my sysadmin and ruby skills for other things.
** huchra replacement (one disk currently crashed, the other going)
In addition to running this mailing list this used to be 1/5th of the
openwrt build cluster.
lists needs to move to a virtual server ASAP.
openwrt could really use a good build cluster. been running most of
theirs now for a couple years, out of machines pulled from the junk
bin.
** Web Site updates
the redmine implementation on bufferbloat.net has been overrun by
spam and I stopped
accepting new contributors that didn't contact me also via email
long ago.
given how hard it would be to update the present website, perhaps
moving to cerowrt.org
on a virtual server will be simpler.
This I can work on now, if you like, I can spin up a Digital Ocean VM that
should be able to run a mailing list with no problems. Getting Postfix setup
should be a snap, I'm not sure what else is needed for the mailing list, but
we can discuss it off the mailing list.
I can free up some time to work on this fairly soon. Let me know when
you can set aside a few hours (I am presently on EST)
Post by David Personette
Did you want a new name or keep
huchra for the VM? Once it's up, getting a list of needed software from
huchra, certs, and the data can be synced over, do some testing, then the
DNS A and MX records can be updated.
Oh, that would be nice. We haven't had anybody caring for the servers since
Richard Pitt left us...
The other big problem is updating from an ancient version of redmine +
postgres to ChiliProject + postgres.
Post by David Personette
Hmm, just saw that Digital Ocean still doesn't have IPv6 yet. Will that be a
problem? Any other suggestions for hosting it? I've used them for several
little projects, they have a good interface and rates, IMHO. Thanks.
Yep, need ipv6. I have a pair of linode instances spun up but have
never done anything with them aside from use them as targets
for rrul tests. One is in NJ, the other in england. Linode seems competent...
I have nothing against Linode's service. The nice things about DigitalOcean
are: all SSD storage, and their rates are about half Linode. Linode offers
more CPUs at each node size however, and has functional IPv6 (which I
thought that they would have straightened out by now *SIGH*).
I'm eastern time too. I get off work at 17:30, but can probably work in an
extra meeting during the day if it works out better for you. I'm off the next
couple of Mondays, so I have 3 day weekends to work on things in. You can
reach me via email, gtalk, "hangouts", and I'll be glad to give you my POTS
numbers on a medium that can't easily be harvested by bots.
--
David P.
--
Dave Täht

Fixing bufferbloat with cerowrt: http://www.teklibre.com/cerowrt/subscribe.html
David Lang
2014-01-15 00:30:42 UTC
Permalink
Post by Dave Taht
I am in strong agreement that cerowrt is close to being ready for a
stable release.
** What is a stable release?
To me a "stable release" is something that has been extensively
tested, benchmarked, and will have a series of updates and security
fixes for 1-2 years. It has a maintainer, a bug database, a means for
dealing with major security issues, and so on.
I don't think we need this, but we do need a newer release that doesn't say
"don't use this on anything you care about"

Even if there is no backporting of patches, a release that we can say "This
seems to work well, here are known bugs" and "we plan to make a release like
this at least once every X months"

This isn't freezing a release and only applying bugfixes to it, it's a rolling,
development process where we can point people at something fairly recent to use
and they can use it with the expectation that by the time it becomes badly
obsolete we will have something newer available for them.

David Lang
Jim Reisert AD1C
2014-01-15 17:31:54 UTC
Permalink
Post by Dave Taht
** Native IPv6 and dhcpv6-pd support
I am very happy with comcast's huge rollout of ipv6 across their
network. (over 25% of their base now) http://www.comcast6.net .
Hearing that cero isn't working on comcast anymore bugs me. I don't
seem to have ipv6 on my directly controlled comcast nodes. Yet.
Setting up a dhcpv6-pd server and some testing is required to make
sure it isn't cerowrt that's busted. I'd like to not go another year
without ipv6.
If you want help testing/debugging this, let me know. I could give
you access to my router if you want to experiment (as long as the
Internet connectivity isn't down for more than a couple of minutes at
a time).

- Jim
--
Jim Reisert AD1C, <***@alum.mit.edu>, http://www.ad1c.us
Maciej Soltysiak
2014-01-20 15:00:29 UTC
Permalink
Post by Dave Taht
** The wndr3800 is obsolete and fixing the next generation soon would be good
On wndr3800 being obsolete... I was asked recently for a router recommendation.
I wanted to say WNDR3800, but do we have anything better that is still
that hackable?
Post by Dave Taht
But this past christmas everything was 802.11ac, running on arm. The
ath10k is out and getting some love, too.
We still don't know the specs of WRT1900AC, do we?

Best regards,
Maciej
Rich Brown
2014-01-20 15:15:26 UTC
Permalink
Post by Maciej Soltysiak
Post by Dave Taht
** The wndr3800 is obsolete and fixing the next generation soon would be good
On wndr3800 being obsolete... I was asked recently for a router recommendation.
I wanted to say WNDR3800, but do we have anything better that is still
that hackable?
The WNDR3800 is still about $100 on Amazon today (in the US). If someone wants to install it this week, it's a good deal. It's eminently hackable, it has an astonishing development team that has really addressed home networking, and they can get the benefit of all that now.
Post by Maciej Soltysiak
Post by Dave Taht
But this past christmas everything was 802.11ac, running on arm. The
ath10k is out and getting some love, too.
We still don't know the specs of WRT1900AC, do we?
I received an initial response from Linksys, who said that they were busy taking care of the details after the CES show. They promised to have someone get back to me who could speak to the technical details.

It's the MLK holiday here in the US, so they are probably not working. I'll ping them tomorrow.
Post by Maciej Soltysiak
Best regards,
Maciej
Rich
