2.6.19-rc1-mm1
Andrew Morton
2006-10-10 07:09:28 UTC
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19-rc1/2.6.19-rc1-mm1/


- Added the ext4 filesystem. Quick usage instructions:

  - Grab updated e2fsprogs from
    ftp://ftp.kernel.org/pub/linux/kernel/people/tytso/e2fsprogs-interim/

  - It's still mke2fs -j /dev/hda1

  - mount /dev/hda1 /wherever -t ext4dev

  - To enable extents,

        mount /dev/hda1 /wherever -t ext4dev -o extents

  - The filesystem is compatible with the ext3 driver until you add a file
    which has extents (i.e. `mount -o extents', then create a file).

  - When comparing performance with other filesystems, remember that
    ext3/4 by default offers higher data integrity guarantees than most. So
    when comparing with a metadata-only journalling filesystem, use `mount -o
    data=writeback'. (Although this doesn't seem to make much difference with
    ext3).

    And you might as well use `mount -o nobh' too.

    Making the journal larger than the mke2fs default often helps
    performance with metadata-intensive workloads.

- Added the high-resolution timers and dynamic-ticks code. Please be sure
  to cc ***@linutronix.de, ***@elte.hu and ***@us.ibm.com if it blows
  up.
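
The ext4 quick-usage steps above can be collected into a small dry-run
script. This is only a sketch: it prints the commands and executes nothing.
DEV, MNT and JOURNAL_MB are placeholder values chosen for illustration, not
values from this announcement, and `-J size=' is the mke2fs option for
requesting a journal size in megabytes. Remember that once a file with
extents exists, the filesystem is no longer usable with the ext3 driver.

```shell
#!/bin/sh
# Dry-run sketch of the ext4dev quick-usage steps. Nothing is executed;
# the commands are only printed so they can be reviewed first.
# DEV, MNT and JOURNAL_MB are illustrative placeholders (assumptions).

DEV=${DEV:-/dev/hda1}
MNT=${MNT:-/wherever}
JOURNAL_MB=${JOURNAL_MB:-128}

ext4dev_commands() {
    # The filesystem is still created with mke2fs -j; -J size= asks for
    # a journal of the given size in megabytes.
    echo "mke2fs -j -J size=$JOURNAL_MB $DEV"
    # Mount with the development driver, with extents enabled and the
    # options suggested for performance comparisons.
    echo "mount $DEV $MNT -t ext4dev -o extents,data=writeback,nobh"
}

ext4dev_commands
```

Drop `extents' from the option list if the filesystem must stay mountable
by the ext3 driver.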



Boilerplate:

- See the `hot-fixes' directory for any important updates to this patchset.

- To fetch an -mm tree using git, use (for example)

    git fetch git://git.kernel.org/pub/scm/linux/kernel/git/smurf/linux-trees.git v2.6.16-rc2-mm1

- -mm kernel commit activity can be reviewed by subscribing to the
  mm-commits mailing list.

    echo "subscribe mm-commits" | mail ***@vger.kernel.org

- If you hit a bug in -mm and it is not obvious which patch caused it, it is
  most valuable if you can perform a bisection search to identify which patch
  introduced the bug. Instructions for this process are at

    http://www.zip.com.au/~akpm/linux/patches/stuff/bisecting-mm-trees.txt

  But beware that this process takes some time (around ten rebuilds and
  reboots), so consider reporting the bug first and if we cannot immediately
  identify the faulty patch, then perform the bisection search.

- When reporting bugs, please try to Cc: the relevant maintainer and mailing
  list on any email.

- When reporting bugs in this kernel via email, please also rewrite the
  email Subject: in some manner to reflect the nature of the bug. Some
  developers filter by Subject: when looking for messages to read.

- Semi-daily snapshots of the -mm lineup are uploaded to
  ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/mm/ and are announced on
  the mm-commits list.
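
To illustrate why the bisection search mentioned above costs around ten
rebuilds, here is a minimal sketch of a binary search over an ordered patch
series. It is not the procedure from bisecting-mm-trees.txt. The series
length of 1000, the FIRST_BAD value and the is_bad helper are all
hypothetical; in real use is_bad would mean "apply the first N patches,
rebuild, reboot and test", which is the slow part.

```shell
#!/bin/sh
# Sketch of a bisection search over an ordered patch series.
# is_bad N must succeed (exit 0) when a tree with the first N patches
# applied shows the bug. Here it is a stand-in driven by FIRST_BAD, a
# hypothetical culprit index used only for this demonstration.

FIRST_BAD=${FIRST_BAD:-700}

is_bad() {
    [ "$1" -ge "$FIRST_BAD" ]
}

# bisect_series TOTAL: print the index of the first bad patch and how
# many trees had to be tested. Assumes that 0 patches applied is good
# and that all TOTAL patches applied is bad.
bisect_series() {
    good=0
    bad=$1
    steps=0
    while [ $((bad - good)) -gt 1 ]; do
        mid=$(( (good + bad) / 2 ))
        steps=$((steps + 1))
        if is_bad "$mid"; then
            bad=$mid
        else
            good=$mid
        fi
    done
    echo "first bad patch: $bad (tests: $steps)"
}

bisect_series 1000
```

Each test halves the remaining range, so the number of rebuilds grows with
the logarithm of the series length rather than with the length itself.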


Changes since 2.6.18-mm3:


origin.patch
git-acpi.patch
git-cifs.patch
git-dvb.patch
git-geode.patch
git-ia64.patch
git-ieee1394.patch
git-infiniband.patch
git-libata-all.patch
git-mtd.patch
git-netdev-all.patch
git-ocfs2.patch
git-pcmcia.patch
git-selinux.patch
git-pciseg.patch
git-s390.patch
git-sh.patch
git-scsi-target.patch
git-qla3xxx.patch
git-watchdog.patch
git-gccbug.patch

git trees.

-pidh-cleanup.patch
-vfs-make-filldir_t-and-struct-kstat-deal-in-64-bit-inode-numbers.patch
-revert-insert-ioapics-and-local-apic-into-resource-map.patch
-acpi-cast-removal.patch
-dereference-after-free-in-snd_hwdep_release.patch
-kauditd_thread-warning-fix.patch
-hdrcheck-permission-fix.patch
-docs-small-kbuild-cleanup.patch
-kthread-update-arch-mips-kernel-apmc.patch
-mmc-driver-for-ti-flashmedia-card-reader-source.patch
-mmc-driver-for-ti-flashmedia-card-reader-kconfig-makefile.patch
-forcedeth-hardirq-lockdep-warning.patch
-hp100-fix-conditional-compilation-mess.patch
-zatm-always-clear-pcr-in-alloc_shaper.patch
-atm-ambassador-fix-return-code-bug.patch
-tipc-fix-printk-warning.patch
-git-powerpc-wrapper-dont-require-execute-permissions.patch
-powerpc-xmon-fix.patch
-pcie_portdrv_restore_config-undefined-without-config_pm.patch
-pci-optionally-sort-device-lists-breadth-first.patch
-scsi-convertion-to-struct-scsi_cmnd-in-ips-driver.patch
-scsi-scsi_cmnd-convertion-in-arm-subtree.patch
-gregkh-usb-usb-storage-unusual_devs.h-entry-for-sony-ericsson-p990i.patch
-usb-serial-mos7840-fix-cast.patch
-x86_64-mm-defconfig-update.patch
-x86_64-mm-i386-defconfig-update.patch
-x86_64-mm-calgary-init.patch
-x86_64-mm-calgary-off-by-one.patch
-x86_64-mm-calgary-jon-contact.patch
-x86_64-mm-calgary-hex-bus.patch
-x86_64-mm-pci-bios-fix.patch
-x86_64-mm-kernel-stack-termination.patch
-fix-x86_64-mm-kernel-stack-termination.patch
-mm-micro-optimise-zone_watermark_ok.patch
-slab-clean-up-leak-tracking-ifdefs-a-little-bit.patch
-kmemdup-introduce-vs-slab-clean-up-leak-tracking-ifdefs-a-little-bit.patch
-slab-reduce-numa-text-size.patch
-slab-reduce-numa-text-size-tidy.patch
-create-kallsyms_lookup_size_offset.patch
-low-performance-of-lib-sortc.patch
-char-kill-unneeded-memsets.patch
-char-serial167-remove-useless-tty-check.patch
-kernel-doc-for-kernel-dmac.patch
-kernel-doc-for-kernel-resourcec.patch
-fs-eventpoll-error-handling-micro-cleanup.patch
-ipmi-fix-uninitd-data-bug.patch
-drivers-char-ip2-kill-unused-code-label.patch
-schedule-ftape-removal.patch
-isdn-warning-fixes.patch
-restore-parport_pc-probing-on-powermac.patch
-add-pekka-to-credits.patch
-ipmi-allow-user-to-override-the-kernel-ipmi-daemon-enable.patch
-ipmi-allow-user-to-override-the-kernel-ipmi-daemon-enable-tidy.patch
-ia64-note-requirement-for-8250_pnp-now-that-8250_acpi-is-gone.patch
-maintainers-removes-duplicated-entry.patch
-pktcdvd-replace-pktcdvd-strings-with-macro-driver_name.patch
-pktcdvd-rename-a-variable-for-better-readability.patch
-remove-unnecessary-check-in-fs-reiserfs-inodec.patch
-add-unifdef-to-gitignore.patch
-fix-spurious-error-on-tags-target-when-missing-defconfig.patch
-pata_hpt366-fix-typo.patch
-hisax-niccy-cleanup.patch
-knfsd-nfsd-lockdep-annotation-fix.patch
-knfsd-call-lockd_down-when-closing-a-socket-via-a-write-to-nfsd-portlist.patch
-knfsd-protect-update-to-sn_nrthreads-with-lock_kernel.patch
-knfsd-fixed-handling-of-lockd-fail-when-adding-nfsd-socket.patch
-knfsd-replace-two-page-lists-in-struct-svc_rqst-with-one.patch
-knfsd-replace-two-page-lists-in-struct-svc_rqst-with-one-fix.patch
-knfsd-avoid-excess-stack-usage-in-svc_tcp_recvfrom.patch
-knfsd-prepare-knfsd-for-support-of-rsize-wsize-of-up-to-1mb-over-tcp.patch
-knfsd-allow-max-size-of-nfsd-payload-to-be-configured.patch
-knfsd-make-nfsd-readahead-params-cache-smp-friendly.patch
-knfsd-knfsd-cache-ipmap-per-tcp-socket.patch
-knfsd-hide-use-of-lockds-h_monitored-flag.patch
-knfsd-consolidate-common-code-for-statd-lockd-notification.patch
-knfsd-when-looking-up-a-lockd-host-pass-hostname-length.patch
-knfsd-lockd-introduce-nsm_handle.patch
-knfsd-lockd-introduce-nsm_handle-fix.patch
-knfsd-misc-minor-fixes-indentation-changes.patch
-knfsd-lockd-make-nlm_host_rebooted-use-the-nsm_handle.patch
-knfsd-lockd-make-the-nsm-upcalls-use-the-nsm_handle.patch
-knfsd-lockd-make-the-hash-chains-use-a-hlist_node.patch
-knfsd-lockd-change-list-of-blocked-list-to-list_node.patch
-knfsd-change-nlm_file-to-use-a-hlist.patch
-knfsd-lockd-make-nlm_traverse_-more-flexible.patch
-knfsd-lockd-add-nlm_destroy_host.patch
-knfsd-simplify-nlmsvc_invalidate_all.patch
-knfsd-lockd-optionally-use-hostnames-for-identifying-peers.patch
-knfsd-make-nlmclnt_next_cookie-smp-safe.patch
-knfsd-match-granted_res-replies-using-cookies.patch
-knfsd-export-nsm_local_state-to-user-space-via-sysctl.patch
-knfsd-lockd-fix-use-of-h_nextrebind.patch
-knfsd-register-all-rpc-programs-with-portmapper-by-default.patch
-knfsd-lockd-introduce-nsm_handle-sem2mutex.patch
-knfsd-svcrpc-gss-factor-out-some-common-wrapping-code.patch
-knfsd-svcrpc-gss-fix-failure-on-svc_denied-in-integrity-case.patch
-knfsd-svcrpc-use-consistent-variable-name-for-the-reply-state.patch
-knfsd-nfsd4-refactor-exp_pseudoroot.patch
-knfsd-nfsd4-clean-up-exp_pseudoroot.patch
-knfsd-nfsd4-acls-relax-the-nfsv4-posix-mapping.patch
-knfsd-nfsd4-acls-fix-inheritance.patch
-knfsd-nfsd4-acls-simplify-nfs4_acl_nfsv4_to_posix-interface.patch
-knfsd-nfsd4-acls-fix-handling-of-zero-length-acls.patch
-knfsd-lockd-fix-refount-on-nsm.patch
-knfsd-fix-auto-sizing-of-nfsd-request-reply-buffers.patch
-knfsd-close-a-race-opportunity-in-d_splice_alias.patch
-knfsd-nfsd-store-export-path-in-export.patch
-knfsd-nfsd4-fslocations-data-structures.patch
-knfsd-nfsd4-fslocations-data-structures-nfsd4-fix-fs-locations-bounds-checking.patch
-knfsd-nfsd4-fslocations-data-structures-nfsd4-fslocs-fix-compile-in-non-config_nfsd_v4-case.patch
-knfsd-nfsd4-xdr-encoding-for-fs_locations.patch
-knfsd-nfsd4-actually-use-all-the-pieces-to-implement-referrals.patch
-sched-force-sbin-init-off-isolated-cpus.patch
-sched-remove-unnecessary-sched-group-allocations.patch
-sched-dont-print-migration-cost-when-only-1-cpu.patch
-sched-introduce-child-field-in-sched_domain.patch
-sched-cleanup-sched_group-cpu_power-setup.patch
-sched-fixing-wrong-comment-for-find_idlest_cpu.patch
-scheduler-numa-aware-placement-of-sched_group_allnodes.patch
-ecryptfs-fs-makefile-and-fs-kconfig.patch
-ecryptfs-fs-makefile-and-fs-kconfig-kconfig-help-update.patch
-ecryptfs-documentation.patch
-ecryptfs-makefile.patch
-ecryptfs-main-module-functions.patch
-ecryptfs-header-declarations.patch
-ecryptfs-superblock-operations.patch
-ecryptfs-dentry-operations.patch
-ecryptfs-file-operations.patch
-ecryptfs-file-operations-readdir-fix-for-seeking-in-directory-streams.patch
-ecryptfs-inode-operations.patch
-ecryptfs-mmap-operations.patch
-ecryptfs-mmap-operations-fix.patch
-ecryptfs-keystore.patch
-ecryptfs-crypto-functions.patch
-ecryptfs-crypto-functions-mutex-fixes.patch
-fs-ecryptfs-possible-cleanups.patch
-ecryptfs-debug-functions.patch
-ecryptfs-alpha-build-fix.patch
-ecryptfs-convert-assert-to-bug_on.patch
-ecryptfs-remove-pointless-bug_ons.patch
-ecryptfs-remove-unnecessary-null-checks.patch
-ecryptfs-rewrite-ecryptfs_fsync.patch
-ecryptfs-overhaul-file-locking.patch
-ecryptfs-remove-lock-propagation.patch
-ecryptfs-dont-muck-with-the-existing-nameidata-structures.patch
-ecryptfs-asm-scatterlisth-linux-scatterlisth.patch
-ecryptfs-support-for-larger-maximum-key-size.patch
-ecryptfs-add-codes-for-additional-ciphers.patch
-ecryptfs-unencrypted-key-size-based-on-encrypted-key-size.patch
-ecryptfs-packet-and-key-management-update-for-variable-key-size.patch
-ecryptfs-add-ecryptfs_-prefix-to-mount-options-key-size-parameter.patch
-ecryptfs-set-the-key-size-from-the-default-for-the-mount.patch
-ecryptfs-check-for-weak-keys.patch
-ecryptfs-add-define-values-for-cipher-codes-from-rfc2440-openpgp.patch
-ecryptfs-convert-bits-to-bytes.patch
-ecryptfs-more-elegant-aes-key-size-manipulation.patch
-ecryptfs-more-intelligent-use-of-tfm-objects.patch
-ecryptfs-remove-debugging-cruft.patch
-ecryptfs-get_sb_dev-fix.patch
-ecryptfs-validate-minimum-header-extent-size.patch
-ecryptfs-validate-body-size.patch
-ecryptfs-validate-packet-length-prior-to-parsing-add-comments.patch
-ecryptfs-use-the-passed-in-max-value-as-the-upper-bound.patch
-ecryptfs-change-the-maximum-size-check-when-writing-header.patch
-ecryptfs-print-the-actual-option-that-is-problematic.patch
-ecryptfs-add-a-maintainers-entry.patch
-ecryptfs-partial-signed-integer-to-size_t-conversion-updated-ii.patch
-ecryptfs-graceful-handling-of-mount-error.patch
-inode-diet-move-i_pipe-into-a-union-ecryptfs.patch
-inode-diet-eliminate-i_blksize-and-use-a-per-superblock-default-ecryptfs.patch
-streamline-generic_file_-interfaces-and-filemap-ecryptfs.patch
-ecryptfs-fix-printk-format-warnings.patch
-ecryptfs-associate-vfsmount-with-dentry-rather-than-superblock.patch
-ecryptfs-mntput-lower-mount-on-umount_begin.patch
-vfs-make-filldir_t-and-struct-kstat-deal-in-64-bit-inode-numbers-ecryptfs.patch
-make-kmem_cache_destroy-return-void-ecryptfs.patch
-ecryptfs-inode-numbering-fixes.patch
-ecryptfs-versioning-fixes.patch
-ecryptfs-versioning-fixes-tidy.patch
-ecryptfs-grab-lock-on-lower_page-in-ecryptfs_sync_page.patch
-ecryptfs-enable-plaintext-passthrough.patch
-non-libata-driver-for-jmicron-devices.patch
-ide-claim-extra-dma-ports-regardless-of-channel.patch
-ide-always-release-dma-engine.patch
-ide-error-handling-fixes.patch
-make-number-of-ide-interfaces-configurable.patch
-ide_dma_speed-fixes.patch
-enable-cdrom-dma-access-with-pdc20265_old.patch
-ide-fix-revision-comparison-in-ide_in_drive_list.patch
-ide-backport-piix-fixes-from-libata-into-the-legacy-driver.patch
-move-ide-to-unmaintained-drop-reference-to-old-git-tree.patch
-ide-core-must_check-fixes.patch
-drivers-ide-cleanups.patch
-ide-remove-dma_base2-field-from-ide_hwif_t.patch
-ide-reprogram-disk-pio-timings-on-resume.patch
-pcmcia-add-few-ids-into-ide-cs.patch
-config_pm=n-slim-drivers-ide-pci-sc1200c.patch
-ide-fix-crash-on-repeated-reset.patch
-ide-fix-crash-on-repeated-reset-tidy.patch
-allow-ide_generic_all-to-be-used-modular-and-built-in.patch
-ide-more-pci_find-cleanup.patch
-ide-cs-compactflash-driver-rm-irq-warning.patch
-au1100fb-add-option-to-enable-disable-the-cursor.patch
-intelfb-documentation-update.patch
-rivafb-use-constants-instead-of-magic-values.patch
-vfb-document-option-to-enable-the-driver.patch
-fbdev-add-generic-ddc-read-functionality.patch
-nvidiafb-use-generic-ddc-reading.patch
-rivafb-use-generic-ddc-reading.patch
-i810fb-use-generic-ddc-reading.patch
-savagefb-use-generic-ddc-reading.patch
-savagefb-use-generic-ddc-reading-fix.patch
-radeonfb-use-generic-ddc-reading.patch
-fbcon-use-persistent-allocation-for-cursor-blinking.patch
-fbcon-remove-cursor-timer-if-unused.patch
-vt-honor-the-return-value-of-device_create_file.patch
-fbdev-honor-the-return-value-of-device_create_file.patch
-fbcon-honor-the-return-value-of-device_create_file.patch
-atyfb-honor-the-return-value-of-pci_register_driver.patch
-matroxfb-honor-the-return-value-of-pci_register_driver.patch
-nvidiafb-honor-the-return-value-of-pci_enable_device.patch
-i810fb-honor-the-return-value-of-pci_enable_device.patch
-drivers-video-sis-init301h-removal-of-old.patch
-drivers-video-sis-initextlfbc-removal-of.patch
-drivers-video-sis-inith-removal-of-old-code.patch
-drivers-video-sis-osdefh-removal-of-old-code.patch
-drivers-video-sis-sis_accelc-removal-of-old.patch
-drivers-video-sis-sis_accelh-removal-of-old.patch
-drivers-video-sis-sis_mainc-removal-of-old.patch
-drivers-video-sis-sis_mainc-removal-of-old-2.patch
-drivers-video-sis-vgatypesh-removal-of-old.patch
-drivers-video-sis-sis_mainh-removal-of-old.patch
-atyfb-possible-cleanups.patch
-mbxfb-fix-a-chip-bug-resulting-in-wrong-pixclock.patch
-mbxfb-fix-framebuffer-size-smaller-than-requested.patch
-fbcon-make-3-functions-static.patch
-vt-proper-prototypes-for-some-console-functions.patch
-sstfb-clean-ups.patch
-documentation-fixes-in-intel810txt.patch
-radeonfb-supend-resume-support-for-acer-aspire-2010.patch
-fbdev-correct-buffer-size-limit-in-fbmem_read_proc.patch
-dm-support-ioctls-on-mapped-devices.patch
-dm-linear-support-ioctls.patch
-dm-mpath-support-ioctls.patch
-dm-export-blkdev_driver_ioctl.patch
-dm-support-ioctls-on-mapped-devices-fix-with-fake-file.patch
-dm-fix-alloc_dev-error-path.patch
-dm-snapshot-fix-invalidation-enomem.patch
-dm-snapshot-allow-zero-chunk_size.patch
-dm-snapshot-fix-metadata-error-handling.patch
-dm-snapshot-make-read-and-write-exception-functions-void.patch
-dm-snapshot-fix-metadata-writing-when-suspending.patch
-dm-snapshot-tidy-snapshot_map.patch
-dm-snapshot-tidy-pending_complete.patch
-dm-snapshot-add-workqueue.patch
-dm-snapshot-tidy-pe-ref-counting.patch
-dm-snapshot-fix-freeing-pending-exception.patch
-dm-mirror-remove-trailing-space-from-table.patch
-dm-mpath-tidy-ctr.patch
-dm-mpath-use-kzalloc.patch
-dm-add-uevent-change-event-on-resume.patch
-dm-add-debug-macro.patch
-dm-table-add-target-preresume.patch
-dm-crypt-add-key-msg.patch
-dm-crypt-restructure-for-workqueue-change.patch
-dm-crypt-restructure-write-processing.patch
-dm-crypt-move-io-to-workqueue.patch
-dm-crypt-use-private-biosets.patch
-dm-use-private-biosets.patch
-dm-extract-device-limit-setting.patch
-dm-table-add-target-flush.patch
-md-the-scheduled-removal-of-the-start_array-ioctl-for-md.patch
-md-fix-a-comment-that-is-wrong-in-raid5h.patch
-md-factor-out-part-of-raid10d-into-a-separate-function.patch
-md-replace-magic-numbers-in-sb_dirty-with-well-defined-bit-flags.patch
-md-remove-the-working_disks-and-failed_disks-from-raid5-state-data.patch
-md-remove-working_disks-from-raid10-state.patch
-md-new-sysfs-interface-for-setting-bits-in-the-write-intent-bitmap.patch
-md-remove-unnecessary-variable-x-in-stripe_to_pdidx.patch
-md-factor-out-part-of-raid1d-into-a-separate-function.patch
-md-remove-working_disks-from-raid1-state-data.patch
-md-improve-locking-around-error-handling.patch
-md-define-backing_dev_infocongested_fn-for-raid0-and-linear.patch
-md-define-congested_fn-for-raid1-raid10-and-multipath.patch
-md-add-a-congested_fn-function-for-raid5-6.patch
-md-make-messages-about-resync-recovery-etc-more-specific.patch
-md-fix-duplicity-of-levels-in-mdtxt.patch
-md-remove-max_md_devs-which-is-an-arbitrary-limit.patch
-md-remove-experimental-classification-from-raid5-reshape.patch
-md-use-ffz-instead-of-find_first_set-to-convert-multiplier-to-shift.patch
-md-allow-set_bitmap_file-to-work-on-64bit-kernel-with-32bit-userspace.patch
-md-add-error-reporting-to-superblock-write-failure.patch
-genirq-convert-the-x86_64-architecture-to-irq-chips.patch
-genirq-convert-the-i386-architecture-to-irq-chips.patch
-genirq-irq-convert-the-move_irq-flag-from-a-32bit-word-to-a-single-bit.patch
-genirq-irq-add-moved_masked_irq.patch
-genirq-x86_64-irq-reenable-migrating-irqs-to-other-cpus.patch
-genirq-msi-simplify-msi-enable-and-disable.patch
-genirq-msi-make-the-msi-boolean-tests-return-either-0-or-1.patch
-genirq-msi-implement-helper-functions-read_msi_msg-and-write_msi_msg.patch
-genirq-msi-refactor-the-msi_ops.patch
-genirq-msi-simplify-the-msi-irq-limit-policy.patch
-genirq-irq-add-a-dynamic-irq-creation-api.patch
-genirq-ia64-irq-dynamic-irq-support.patch
-genirq-i386-irq-dynamic-irq-support.patch
-genirq-x86_64-irq-dynamic-irq-support.patch
-genirq-msi-make-the-msi-code-irq-based-and-not-vector-based.patch
-genirq-x86_64-irq-move-msi-message-composition-into-io_apicc.patch
-genirq-i386-irq-move-msi-message-composition-into-io_apicc.patch
-genirq-msi-only-build-msi-apicc-on-ia64.patch
-genirq-msi-only-build-msi-apicc-on-ia64-fix.patch
-genirq-x86_64-irq-remove-the-msi-assumption-that-irq-==-vector.patch
-genirq-i386-irq-remove-the-msi-assumption-that-irq-==-vector.patch
-genirq-irq-remove-msi-hacks.patch
-genirq-irq-generalize-the-check-for-hardirq_bits.patch
-genirq-x86_64-irq-make-the-external-irq-handlers-report-their-vector-not-the-irq-number.patch
-genirq-x86_64-irq-make-vector_irq-per-cpu.patch
-genirq-x86_64-irq-make-vector_irq-per-cpu-fix.patch
-genirq-x86_64-irq-make-vector_irq-per-cpu-warning-fix.patch
-genirq-x86_64-irq-kill-gsi_irq_sharing.patch
-genirq-x86_64-irq-kill-irq-compression.patch
-add-hypertransport-capability-defines.patch
-add-hypertransport-capability-defines-fix.patch
-initial-generic-hypertransport-interrupt-support.patch
-initial-generic-hypertransport-interrupt-support-Kconfig-fix.patch
-msi-simplify-msi-sanity-checks-by-adding-with-generic-irq-code.patch
-msi-only-use-a-single-irq_chip-for-msi-interrupts.patch
-msi-refactor-and-move-the-msi-irq_chip-into-the-arch-code.patch
-msi-move-the-ia64-code-into-arch-ia64.patch
-htirq-tidy-up-the-htirq-code.patch
-genirq-clean-up-irq-flow-type-naming.patch
-srcu-3-rcu-variant-permitting-read-side-blocking.patch
-srcu-3-rcu-variant-permitting-read-side-blocking-fix.patch
-srcu-3-rcu-variant-permitting-read-side-blocking-srcu-add-lock-annotations.patch
-srcu-3-rcu-variant-permitting-read-side-blocking-comments.patch
-srcu-3-add-srcu-operations-to-rcutorture.patch
-srcu-3-add-srcu-operations-to-rcutorture-fix.patch
-add-srcu-based-notifier-chains.patch
-add-srcu-based-notifier-chains-cleanup.patch
-srcu-report-out-of-memory-errors.patch
-srcu-report-out-of-memory-errors-fixlet.patch
-cpufreq-make-the-transition_notifier-chain-use-srcu.patch
-rcu-add-module_author-to-rcutorture-module.patch
-rcu-fix-incorrect-description-of-default-for-rcutorture.patch
-rcu-mention-rcu_bh-in-description-of-rcutortures.patch
-rcu-avoid-kthread_stop-on-invalid-pointer-if-rcutorture.patch
-rcu-fix-sign-bug-making-rcu_random-always-return-the-same.patch
-rcu-add-fake-writers-to-rcutorture.patch
-rcu-add-fake-writers-to-rcutorture-tidy.patch
-rcu-refactor-srcu_torture_deferred_free-to-work-for.patch
-rcu-add-rcu_sync-torture-type-to-rcutorture.patch
-rcu-add-rcu_bh_sync-torture-type-to-rcutorture.patch
-rcu-add-sched-torture-type-to-rcutorture.patch
-rcu-simplify-improve-batch-tuning.patch
-rcu-credits-and-maintainers.patch
-the-scheduled-removal-of-some-oss-drivers.patch
-the-scheduled-removal-of-some-oss-drivers-fix.patch
-the-scheduled-removal-of-some-oss-drivers-fix-fix.patch
-kill-sound-oss-_symsc.patch
-kill-include-linux-configh.patch
-pci_module_init-convertion-in-ata_genericc.patch
-pci_module_init-convertion-in-ata_genericc-fix.patch
-pci_module_init-convertion-in-amso1100-driver.patch
-pci_module_init-convertion-for-k8_edacc.patch
-pci_module_init-convertion-in-the-legacy-megaraid-driver.patch
-pci_module_init-convertion-in-olympicc.patch
-pci_module_init-conversion-for-pata_pdc2027x.patch
-pci_module_init-convertion-in-tmscsimc.patch
-pr_debug-aio-use-size_t-length-modifier-in-pr_debug-format-arguments.patch
-pr_debug-configfs-use-size_t-length-modifier-in-pr_debug-format-argument.patch
-pr_debug-sysfs-use-size_t-length-modifier-in-pr_debug-format-arguments.patch
-pr_debug-umem-repair-nonexistant-bh-pr_debug-reference.patch
-pr_debug-tipar-repair-nonexistant-pr_debug-argument-use.patch
-pr_debug-dell_rbu-fix-pr_debug-argument-warnings.patch
-pr_debug-ifb-replace-missing-comma-to-separate-pr_debug-arguments.patch
-pr_debug-trident-use-size_t-length-modifier-in-pr_debug-format-arguments.patch
-pr_debug-check-pr_debug-arguments-arm-fix.patch
-isdn-debug-build-fix.patch
-isdn-more-pr_debug-fixes.patch
-pr_debug-check-pr_debug-arguments.patch
-squash-tcp-warnings.patch

Merged into mainline or a subsystem tree.

+null-dereference-in-fs-jbd-journalc.patch
+irq-fix-avr32-breakage.patch
+mm-use-symbolic-names-instead-of-indices-for-zone-initialisation.patch
+mm-remove-memmap_zone_idx.patch
+fix-menuconfig-build-failure-due-to-missing-stdboolh.patch
+user-struct-irq_chip-instead-of-struct-hw_interrupt_type.patch
+disable-detect_softlockup-for-s390.patch

2.6.19 queue.

+revert-pci-quirk-for-ibm-dock-ii-cardbus-controllers.patch
+revert-nvidiafb-use-generic-ddc-reading.patch

Will move to the 2.6.19 queue soon if the underlying problems don't look like getting fixed.

+ext4-copy.patch
+ext4-rename.patch
+ext4-enable.patch
+jbd2-copy.patch
+jbd2-rename.patch
+jbd2-rename-slab.patch
+jbd2-enable.patch
+jbd2-cleanup.patch
+ext4-extents.patch
+ext4_fsblk_sector_t.patch
+ext4-extents-48bit.patch
+ext4-unitialized-extent-handling.patch
+extents_comment_fix.patch
+64bit_jbd2_core.patch
+sector_t-jbd2.patch
+ext4_48bit_i_file_acl.patch
+64bit-metadata.patch
+ext4_blk_type_from_sector_t_to_ulonglong.patch
+ext4_blk_type_from_sector_t_to_ulonglong-fix.patch
+ext4_remove_sector_t_bits_check.patch
+jbd2_blks_type_from_sector_t_to_ull.patch
+ext4_allow_larger_descriptor_size.patch
+ext4_move_block_number_hi_bits.patch
+ext4-uninline-ext4_get_group_no_and_offset.patch
+ext4-64-bit-divide-fix.patch
+ext4-64-bit-divide-fix-fix.patch
+ext4-rename-logic_sb_block.patch
+ext4-errors-behaviour-fix.patch
+ext4-whitespace-cleanups.patch

ext4

+i386-acpi-build-fix.patch

ACPI fix

+cifs-kconfig-dont-select-connector.patch

CIFS Kconfig sanity

+gregkh-driver-documentation-feature-removal-schedule-typo.patch
+gregkh-driver-driver-core-don-t-ignore-error-returns-from-probing.patch
+gregkh-driver-driver-core-bus-remove-indentation-level.patch
+gregkh-driver-aoe-eliminate-isbusy-message.patch
+gregkh-driver-aoe-update-copyright-date.patch
+gregkh-driver-aoe-remove-unused-nargs-enum.patch
+gregkh-driver-aoe-zero-copy-write-1-of-2.patch
+gregkh-driver-aoe-jumbo-frame-support-1-of-2.patch
+gregkh-driver-aoe-clean-up-printks-via-macros.patch
+gregkh-driver-aoe-jumbo-frame-support-2-of-2.patch
+gregkh-driver-aoe-improve-retransmission-heuristics.patch
+gregkh-driver-aoe-zero-copy-write-2-of-2.patch
+gregkh-driver-aoe-module-parameter-for-device-timeout.patch
+gregkh-driver-aoe-use-bio-bi_idx.patch
+gregkh-driver-aoe-remove-sysfs-comment.patch
+gregkh-driver-aoe-update-driver-version.patch
+gregkh-driver-aoe-revert-printk-macros.patch
+gregkh-driver-aoe-fix-sysfs-warnings.patch
+gregkh-driver-driver-link-sysfs-timing.patch

Driver tree updates.

+w1-kconfig-fix.patch
+fs-partitions-check-add-sysfs-error-handling.patch
+char-nozomi-use-tty_wakeup.patch

Misc fixes against driver tree.

+drm-fix-error-returns-sysfs-error-handling.patch

DRM sysfs fix.

+git-dvb-build-fix.patch

Fix rejects in git-dvb.patch

+gregkh-i2c-w1-ioremap-balanced-with-iounmap.patch

I2C tree update.

+kill-include-linux-configh-ia64.patch

ia64 cleanup.

-revert-input-make-input_openclose_device-more-robust.patch

Dropped.

+pci_module_init-convertion-in-amso1100-driver.patch
+drivers-infiniband-hw-amso1100-c2_rnicc-fix-a-null-dereference.patch

infiniband things.

+git-input-fixup.patch

Fix rejects in git-input.patch (which isn't here. But it compiles. hrm)

+ata-must-depend-on-block.patch
+pci_module_init-convertion-in-ata_genericc.patch
+pci_module_init-convertion-in-ata_genericc-fix.patch
+pci_module_init-conversion-for-pata_pdc2027x.patch

sata/pata things.

+mtd-maps-add-parameter-to-amd76xrom-to-override-rom-window-size-if-set-incorrectly-by-bios.patch
+mtd-maps-add-parameter-to-amd76xrom-to-override-rom-window-size-if-set-incorrectly-by-bios-tweak.patch
+mtd-chips-support-for-sst-49lf040b-flash-chip.patch
+mtd-maps-support-for-bios-flash-chips-on-intel-esb2-southbridge.patch
+mtd-maps-support-for-bios-flash-chips-on-intel-esb2-southbridge-tidy.patch
+mtd-maps-support-for-bios-flash-chips-on-intel-esb2-southbridge-fix.patch

MTD updates.

+libphy-dont-do-that.patch

netdev build fix (allegedly deadlocks).

-powerpc-cell-spidernet-burst-alignment-patch.patch
-powerpc-cell-spidernet-low-watermark-patch.patch
-powerpc-cell-spidernet-stop-error-printing-patch.patch
-powerpc-cell-spidernet-ethtool-i-version-number-info.patch
-powerpc-cell-spidernet-ethtool-i-version-number.patch
-powerpc-cell-spidernet-refine-locking.patch

Dropped.

+pci_module_init-convertion-in-olympicc.patch
+ibmveth-irq-fix.patch

netdev fixes.

+drivers-atm-no-need-to-return-void.patch

Driver cleanup.

-git-parisc-powerpc-fix.patch

Dropped.

+git-serial-fixup.patch

Actually a pcmcia fix.

+ioremap-balanced-with-iounmap-for-drivers-pcmcia.patch
+export-soc_common_drv_pcmcia_remove-to-allow-modular-pcmcia.patch

pcmcia fixes.

-git-serial-fixup.patch

Dropped.

-serial-fix-uart_bug_txen-test.patch

Unneeded, dropped.

+gregkh-pci-acpipnp-dma-resource-setup-fix.patch
+gregkh-pci-pci-fix-pcie_portdrv_restore_config-undefined-without-config_pm-error.patch
+gregkh-pci-pci-stamp-out-pci_find_-usage-in-fakephp.patch
+gregkh-pci-shpchp-fix-command-completion-check.patch
+gregkh-pci-shpchp-remove-unnecessary-cmd_busy-member-from-struct.patch
+gregkh-pci-pci-hotplug-ioremap-balanced-with-iounmap.patch
+gregkh-pci-pci-improve-pci_msi_supported-comments.patch
+gregkh-pci-pci-update-msi-howto.txt-according-to-pci_msi_supported.patch
+gregkh-pci-change-pci-hotplug-subsystem-maintainer-to-kristen.patch
+gregkh-pci-pci-optionally-sort-device-lists-breadth-first.patch
+gregkh-pci-pci-quirks-fix-the-festering-mess-that-claims-to-handle-ide-quirks.patch
-gregkh-pci-altix-rom-shadowing.patch
+gregkh-pci-altix-initial-acpi-support-rom-shadowing.patch

PCI tree updates.

-revert-gregkh-pci-altix-rom-shadowing.patch
-revert-gregkh-pci-altix-sn-acpi-hotplug-support.patch
-revert-gregkh-pci-altix-add-initial-acpi-io-support.patch

Dropped.

-revert-pci-assign-ioapic-resource-at-hotplug.patch

Dropped.

+quirks-switch-quirks-code-offender-to-use-pci_get-api.patch

PCI fix.

+scsi-scsi_cmnd-convertion-in-sun3-driver.patch
+scsi-scsi_cmnd-conversion-in-qlogicfas408-driver.patch
+scsi-scsi_cmnd-convertion-in-psi240i-driver.patch
+pci_module_init-convertion-in-the-legacy-megaraid-driver.patch
+pci_module_init-convertion-in-tmscsimc.patch
+aic94xx-sata-tag-mask-not-set-correctly.patch
+maintain-module-parameter-name-consistency-with-qla2xxx-qla4xxx.patch
+scsi_libc-use-build_bug_on.patch
+drivers-scsi-dpt_i2oc-remove-dead-code.patch

SCSI fixes.

+gregkh-usb-usb-fix-use-after-free-in-wacom_sys.c.patch
+gregkh-usb-airprime-new-device-id.patch
+gregkh-usb-usb-support-for-bt-on-air-usb-modem-in-cdc-acm.c.patch
+gregkh-usb-usb-suspend-resume-support-for-kaweth.patch
+gregkh-usb-usb-ohci-pnx4008-build-fixes.patch
+gregkh-usb-ueagle-be-suspend-friendly.patch
+gregkh-usb-ueagle-use-interruptible-sleep.patch
+gregkh-usb-ueagle-comestic-changes.patch
+gregkh-usb-usb-fix-cdc-acm-problems-with-hard-irq.patch
+gregkh-usb-usb-unusual_devs-entry-for-nokia-6131.patch
+gregkh-usb-usbatm-fix-tiny-race.patch
+gregkh-usb-speedtch-extended-reach.patch
+gregkh-usb-cxacru-add-the-zte-zxdsl-852.patch
+gregkh-usb-usb-fix-suspend-support-for-usblp.patch
+gregkh-usb-usb-ftdi-elan-fix-sparse-warnings.patch
+gregkh-usb-usb-ehci-hcd-make-ehci_iso_stream-instances-more-persistent.patch
+gregkh-usb-usb-ehci-hcd-periodic-startup-shutdown-centralization-and-hysteresis.patch
+gregkh-usb-usb-ehci-hcd-group-interrupt-endpoint-code-into-one-place.patch
+gregkh-usb-usb-ehci-hcd-group-ehci_iso_sched-functions-into-one-place.patch
+gregkh-usb-usb-ehci-hcd-group-ehci_iso_sched-and-ehci_itd-code.patch
+gregkh-usb-usb-ehci-hcd-group-ehci_sitd-code-in-one-place.patch
+gregkh-usb-usb-ehci-hcd-refactor-sitd-link-patch-code-for-easier-frame-spanning.patch
+gregkh-usb-usb-ehci-hcd-split-scan_periodic-to-reuse-code-for-spanned-completions.patch
+gregkh-usb-usb-ehci-hcd-unify-interval-granularity-and-limit-depth-of-interrupt-tree.patch
+gregkh-usb-usb-ehci-hcd-add-shadow-budget-code.patch
+gregkh-usb-usb-ehci-hcd-activate-shadow-budget-tracking.patch
+gregkh-usb-usb-ehci-hcd-activate-use-of-shadow-budget-for-scheduling-decisions.patch
+gregkh-usb-usb-ehci-hcd-add-fstn-support.patch
+gregkh-usb-usb-ehci-hcd-add-sitd-frame-spanning-support.patch
+gregkh-usb-ehci-hcd-fix-budget_pool-allocation-for-machines-with-multiple-ehci-controllers.patch
+gregkh-usb-usb-usbaudio-correct-bug-caused-by-harmless-underrun-during-playback-setup.patch

USB tree updates.

+fix-gregkh-usb-usbatm-fix-tiny-race.patch
+memory-leak-in-drivers-usb-serial-airprimec.patch
+extract-and-implement-are-bit-field-manipulation-routines.patch
+drivers-usb-net-use-build_bug_on.patch
+drivers-usb-misc-ftdi-elanc-remove-dead-code.patch
+drivers-usb-serial-mos7840c-fix-a-check-after-dereference.patch

USB updates.

-git-watchdog-fixup.patch

Dropped.

+airo-suspend-fix.patch
+prism54-use-build_bug_on.patch

Wireless driver fixes.

+x86_64-overlapping-program-headers-in-physical-addr-space-fix.patch
+sleazy-fpu-feature-i386-support.patch
+add-seccomp_disable_tsc-config-option.patch
+i386-fix-recursive-faults-during-oops-when-current.patch
+x86-remove-default_ldt-and-simplify-ldt-setting.patch
+fix-buggy-mtrr-address-checks.patch
+i386-espfix-cleanup.patch
+x86_64-hot-add-memroy-sratc-fix.patch
+x86_64-add-missing-enter_idle-calls.patch
+x86_64-rename-x86_feature_dtes-to-x86_feature_ds.patch
+add-x86_feature_pebs-and-detection.patch
+i386-rename-x86_feature_dtes-to-x86_feature_ds.patch
+i386-add-x86_feature_pebs-and-detection.patch
+remove-pointless-printk-from-i386-oops-output.patch
+compress-stack-unwinder-output.patch
+x86_64-use-build_bug_on-in-fpu-code.patch
+fix-for-arch-x86_64-pci-makefile-cflags.patch
+i386-math-emu-fix-must_checks.patch

x86 and x86_64 updates.

+touchkit-ps-2-touchscreen-driver-configh.patch
+touchkit-ps-2-touchscreen-driver-regs-fix.patch

Fix touchkit-ps-2-touchscreen-driver.patch

+direct-io-sync-and-invalidate-file-region-when-falling-back-to-buffered-write.patch
+direct-io-sync-and-invalidate-file-region-when-falling-back-to-buffered-write-fixes.patch

Make direct-io fallback-to-buffered work more like direct-io normally does.

+mm-kevent-threads-use-mpol_default.patch
+move-rmap-bug_on-outside-debug_vm.patch
+fix-do_mbind-warning-with-config_migration=n.patch
+memory-page-alloc-minor-cleanups.patch

MM updates

-get-rid-of-zone_table-fix.patch
-get-rid-of-zone_table-fix-2.patch
-get-rid-of-zone_table-fix-4.patch

Folded into get-rid-of-zone_table.patch

-deal-with-cases-of-zone_dma-meaning-the-first-zone-fix.patch

Folded into deal-with-cases-of-zone_dma-meaning-the-first-zone.patch

-optional-zone_dma-in-the-vm-tidy.patch

Folded into optional-zone_dma-in-the-vm.patch

+zoneid-fix-up-calculations-for-zoneid_pgshift.patch

Fix zone rework patches in -mm.

-swap-token-try-to-grab-swap-token-before-the-vm-selects-pages-for-eviction.patch
-swap-token-new-scheme-to-preempt-token.patch
-swap-token-new-scheme-to-preempt-token-tidy.patch
+grab-swap-token-reordered.patch
+new-scheme-to-preempt-swap-token.patch
+new-scheme-to-preempt-swap-token-tidy.patch
+shared-page-table-for-hugetlb-page-v4.patch
+htlb-forget-rss-with-pt-sharing.patch
+mm-arch_free_page-fix.patch
+mm-locks_freed-fix.patch
+mm-add-arch_alloc_page.patch

More MM updates.

-fix-tiacx-on-alpha.patch
-tiacx-fix-attribute-packed-warnings.patch
-tiacx-fix-attribute-packed-warnings-fix.patch
-tiacx-pci-build-fix.patch
-tiacx-ia64-fix.patch
-tiacx-build-fix.patch
-tiacx-sparse-cleanups.patch

Folded into acx1xx-wireless-driver.patch

-swsusp-add-resume_offset-command-line-parameter-rev-2-fix.patch

Folded into swsusp-add-resume_offset-command-line-parameter-rev-2.patch

-swsusp-add-ioctl-for-swap-files-support-fix.patch

Folded into swsusp-add-ioctl-for-swap-files-support.patch

+uml-revert-wrong-patch.patch
+uml-correct-removal-of-pte_mkexec.patch
+uml-readd-forgot-prototype.patch
+uml-make-tt-mode-compile-after-setjmp-related-changes.patch
+uml-make-uml_setjmp-always-safe.patch
+uml-fix-processor-selection-to-exclude-unsupported-processors-and-features.patch
+uml-fix-uname-under-setarch-i386.patch
+uml-declare-in-kconfig-our-partial-lockdep-support.patch
+uml-allow-using-again-x86-x86_64-crypto-code.patch
+uml-asm-offsets-duplication-removal.patch
+uml-remove-duplicate-export.patch
+uml-deprecate-config_mode_tt.patch
+uml-allow-finer-tuning-for-host-vmsplit-setting.patch

UML updates

-edac-new-opteron-athlon64-memory-controller-driver-tidy.patch

Folded into edac-new-opteron-athlon64-memory-controller-driver.patch

-add-address_space_operationsbatch_write-fix.patch

Folded into add-address_space_operationsbatch_write.patch

+pci_module_init-convertion-for-k8_edacc.patch
+kbuild-dont-put-temp-files-in-the-source-tree.patch
+grow_buffers-infinite-loop-fix.patch
+ide-generic-jmicron-fix.patch
+fix-rescan_partitions-to-return-errors-properly.patch
+fix-check_partition-routines.patch
+fix-module-taint-flags-listing-in-oops-panic.patch
+ext3-errors-behaviour-fix.patch
+ext2-errors-behaviour-fix.patch
+tpm-fix-error-handling.patch
+sched-likely-profiling.patch
+serial-uartlite-driver.patch
+serial-uartlite-driver-fix.patch
+invalidate_inode_pages2_range-debug.patch
+x86-microcode-handle-sysfs-error.patch
+apm-share-apm-emulator-between-architectures.patch
+32-bit-compatibility-hdio-ioctls.patch
+pktcdvd-reusability-of-procfs-functions.patch
+pktcdvd-make-procfs-interface-optional.patch
+bitmap-parse-input-from-kernel-and-user-buffers-2.patch
+# drivers-add-lcd-support.patch: Pavel says use fbcon
+drivers-add-lcd-support.patch
+drivers-add-lcd-support-update.patch
+ioremap-balanced-with-iounmap-for-drivers-char-rio-rio_linuxc.patch
+ioremap-balanced-with-iounmap-for-drivers-char-moxac.patch
+ioremap-balanced-with-iounmap-for-drivers-char-istallionc.patch
+sound-oss-btaudioc-ioremap-balanced-with-iounmap.patch
+document-the-core-dump-to-a-pipe-patch.patch
+lockdep-annotate-nfs-nfsd-in-kernel-sockets.patch
+lockdep-annotate-nfs-nfsd-in-kernel-sockets-tidy.patch
+honour-mnt_noexec-for-access.patch
+vm-fix-the-gfp_mask-in-invalidate_complete_page2.patch
+posix-cpu-timers-prevent-signal-delivery-starvation.patch
+remove-unnecessary-check-in-fs-fat-inodec.patch
+d-cache-aliasing-issue-in-__block_prepare_write.patch
+use-linux-ioh-instead-of-asm-ioh.patch
+consolidate-check_signature.patch
+fix-typos-in-mm-shmem_aclc.patch
+ht_irq-must-depend-on-pci.patch
+fs-use-build_bug_on.patch
+dac960-use-memmove-for-overlapping-areas.patch
+lockdep-use-build_bug_on.patch
+fix-lockdep-designtxt.patch
+lockdep-fix-printk-recursion-logic.patch
+kernel-doc-fix-function-name-in-usercopyc.patch
+uaccessh-match-kernel-doc-and-function-names.patch
+kernel-doc-drop-various-inline-qualifiers.patch
+include-linux-typesh-in-linux-nbdh.patch
+kernel-doc-make-parameter-description-indentation-uniform.patch
+dell_rbu-printk-warning-fix.patch

misc.

-generic-bug-handling.patch
-use-generic-bug-for-i386.patch
-use-generic-bug-for-x86-64.patch
-use-generic-bug-for-powerpc.patch
-use-generic-bug-for-powerpc-fix-2.patch
-bug-test-1.patch
+generic-implementatation-of-bug.patch
+generic-implementatation-of-bug-fix.patch
+generic-bug-for-i386.patch
+generic-bug-for-x86-64.patch
+uml-add-generic-bug-support.patch
+use-generic-bug-for-ppc.patch
+bug-test-1.patch

Updated generic-bug implementation.

+log2-implement-a-general-integer-log2-facility-in-the-kernel.patch
+log2-implement-a-general-integer-log2-facility-in-the-kernel-fix.patch
+log2-alter-roundup_pow_of_two-so-that-it-can-use-a-ilog2-on-a-constant.patch
+log2-alter-get_order-so-that-it-can-make-use-of-ilog2-on-a-constant.patch
+log2-provide-ilog2-fallbacks-for-powerpc.patch

generic log2() implementation.

+fs-cache-provide-a-filesystem-specific-syncable-page-bit-ext4.patch

Fix ext4 for fs-cache-provide-a-filesystem-specific-syncable-page-bit.patch

+nfs-use-local-caching-configh.patch

Clean up nfs-use-local-caching.patch

+fs-cache-cachefiles-a-cache-that-backs-onto-a-mounted-filesystem-log2-fix.patch

log2() fix.

+char-mxser_new-revert-spin_lock-changes.patch
+char-mxser_new-remove-request-for-testers-line.patch
+char-mxser_new-debug-printk-dependent-on-debug.patch
+char-mxser_new-alter-license-terms.patch
+char-mxser_new-code-upside-down.patch
+char-mxser_new-cmspar-is-defined.patch
+char-remove-unneded-termbits-redefinitions-mxser_new.patch
+char-mxser_new-eliminate-tty-ldisc-deref.patch
+char-mxser_new-testbit-for-bit-testing.patch
+char-mxser_new-correct-fail-paths.patch
+char-mxser_new-dont-check-tty_unregister-retval.patch
+char-mxser_new-compress-isa-finding.patch
+char-mxser_new-register-tty-devices-on-the-fly.patch
+char-mxser_new-compact-structures-round2.patch
+char-mxser_new-reverse-if-else-paths-patch.patch
+char-mxser_new-comments-cleanup.patch

More updates to the updated mxser driver.

-readahead-sysctl-parameters-fix.patch

Folded into readahead-sysctl-parameters.patch

-readahead-state-based-method-aging-accounting-apply-type-enum-zone_type-readahead.patch

Folded into readahead-state-based-method-aging-accounting.patch

-readahead-call-scheme-fix.patch

Folded into readahead-call-scheme.patch

+reiser4-configh.patch

reiser4 cleanup

+reiser4-format-subversion-numbers-heir-set-and-file-conversion.patch
+reiser4-cleanups-in-lzo-compression-library.patch

reiser4 updates.

+ioremap-balanced-with-iounmap-for-drivers-video-virgefb.patch
+ioremap-balanced-with-iounmap-for-drivers-video-vesafb.patch
+ioremap-balanced-with-iounmap-for-drivers-video-tridentfb.patch
+ioremap-balanced-with-iounmap-for-drivers-video-tgafb.patch
+ioremap-balanced-with-iounmap-for-drivers-video-stifb.patch
+ioremap-balanced-with-iounmap-for-drivers-video-retz3fb.patch
+ioremap-balanced-with-iounmap-for-drivers-video-pvr2fb.patch
+ioremap-balanced-with-iounmap-for-drivers-video-platinumfb.patch
+ioremap-balanced-with-iounmap-for-drivers-video-offb.patch
+ioremap-balanced-with-iounmap-for-drivers-video-macfb.patch
+ioremap-balanced-with-iounmap-for-drivers-video-hpfb.patch
+ioremap-balanced-with-iounmap-for-drivers-video-fm2fb.patch
+ioremap-balanced-with-iounmap-for-drivers-video-ffb.patch
+ioremap-balanced-with-iounmap-for-drivers-video-cyberfb.patch
+ioremap-balanced-with-iounmap-for-drivers-video-cirrusfb.patch
+ioremap-balanced-with-iounmap-for-drivers-video-atyfb_base.patch
+ioremap-balanced-with-iounmap-for-drivers-video-atafb.patch
+ioremap-balanced-with-iounmap-for-drivers-video-amifb.patch
+ioremap-balanced-with-iounmap-for-drivers-video-S3triofb.patch

ioremap() fixes in fbdev drivers.

-statistics-infrastructure-prerequisite-timestamp-fix.patch

Folded into statistics-infrastructure-prerequisite-timestamp.patch

-statistics-infrastructure-update-9.patch

Folded into statistics-infrastructure.patch

-statistics-infrastructure-exploitation-zfcp-sched_clock-fix.patch

Folded into statistics-infrastructure-exploitation-zfcp.patch

+dio-centralize-completion-in-dio_complete.patch
+dio-call-blk_run_address_space-once-per-op.patch
+dio-formalize-bio-counters-as-a-dio-reference-count.patch
+dio-remove-duplicate-bio-wait-code.patch
+dio-only-call-aio_complete-after-returning-eiocbqueued.patch

direct-io cleanups and fixes.

+fdtable-delete-pointless-code-in-dup_fd.patch
+fdtable-make-fdarray-and-fdsets-equal-in-size.patch
+fdtable-remove-the-free_files-field.patch
+fdtable-implement-new-pagesize-based-fdtable-allocator.patch

Redo the fdtable code.

+gtod-exponential-update_wall_time.patch
+gtod-persistent-clock-support-core.patch
+gtod-persistent-clock-support-i386.patch
+time-uninline-jiffiesh.patch
+time-fix-msecs_to_jiffies-bug.patch
+time-fix-timeout-overflow.patch
+cleanup-uninline-irq_enter-and-move-it-into-a-function.patch
+dynticks-extend-next_timer_interrupt-to-use-a-reference-jiffie.patch
+hrtimers-namespace-and-enum-cleanup.patch
+hrtimers-clean-up-locking.patch
+hrtimers-state-tracking.patch
+hrtimers-clean-up-callback-tracking.patch
+hrtimers-move-and-add-documentation.patch
+clockevents-core.patch
+clockevents-drivers-for-i386.patch
+high-res-timers-core.patch
+gtod-mark-tsc-unusable-for-highres-timers.patch
+dynticks-core.patch
+dynticks-add-nohz-stats-to-proc-stat.patch
+dynticks-i386-arch-code.patch
+high-res-timers-dynticks-enable-i386-support.patch
+debugging-feature-timer-stats.patch

hrtimers and dynamic ticks.

+kevent-timer-notifications-fix.patch
+kevent-fix-socket-notifications.patch
+kevent-remove-mmap-interface.patch

kevent updates




All 766 patches:

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19-rc1/2.6.19-rc1-mm1/patch-list


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel-announce" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Arjan van de Ven
2006-10-10 07:20:00 UTC
Post by Andrew Morton
+htlb-forget-rss-with-pt-sharing.patch
if it's ok to ignore RSS, can we consider the shared pagetables for
normal pages patch? It saves quite a bit of memory on even desktop
workloads as well as avoiding several (soft) pagefaults.

So.. what does RSS actually mean? Can we ignore it somewhat for
shared-readonly mappings ?
--
if you want to mail me at work (you don't), use arjan (at) linux.intel.com
Andrew Morton
2006-10-10 07:45:26 UTC
On Tue, 10 Oct 2006 09:20:00 +0200
Post by Arjan van de Ven
Post by Andrew Morton
+htlb-forget-rss-with-pt-sharing.patch
Which I didn't write. cc's added.
Post by Arjan van de Ven
if it's ok to ignore RSS,
We'd prefer not to. But what's the alternative?
Post by Arjan van de Ven
can we consider the shared pagetables for
normal pages patch?
Has been repeatedly considered, but Hugh keeps finding bugs in it.
Post by Arjan van de Ven
It saves quite a bit of memory on even desktop
workloads as well as avoiding several (soft) pagefaults.
So.. what does RSS actually mean? Can we ignore it somewhat for
shared-readonly mappings ?
We'd prefer to go the other way, and implement RLIMIT_RSS wouldn't we?
Arjan van de Ven
2006-10-10 08:03:21 UTC
Post by Andrew Morton
Post by Arjan van de Ven
if it's ok to ignore RSS,
We'd prefer not to. But what's the alternative?
it's a good question; today (2.6.18) we have some de facto behavior of
RSS; 2.6.19-rc1-mm1 has a somewhat different one. Either can be entirely
valid; and we can obviously implement either. We can go even further and
remove more from RSS to help save memory and pagefaults (both help
desktop performance) by going the shared pagetable road.
Post by Andrew Morton
Post by Arjan van de Ven
can we consider the shared pagetables for
normal pages patch?
Has been repeatedly considered, but Hugh keeps finding bugs in it.
the latest one I tried looked relatively simple (earlier ones were very
complex) so maybe Hugh can find time to give it another lookover?
Post by Andrew Morton
Post by Arjan van de Ven
It saves quite a bit of memory on even desktop
workloads as well as avoiding several (soft) pagefaults.
So.. what does RSS actually mean? Can we ignore it somewhat for
shared-readonly mappings ?
We'd prefer to go the other way, and implement RLIMIT_RSS wouldn't we?
Well... that again depends on how we define RSS. implementing the rlimit
doesn't mean we can't NOT count certain things (like the hugetlb pages
in the patch above, or shared read only pagecache pages) to be part of
it. It's a fundamental "what does it mean" thing.
You can argue that RSS means "all memory that the application has in
its address space", you can argue "all such memory except a few cases",
you can argue "all memory that is private/exclusive to the
application"...
This is not a pointless piss-in-the-wind discussion; unless we define
rather specifically what it really means, the RLIMIT doesn't mean anything
either.

We need to consider at least if any of the following are part of rss:
* VM_IO io mmaped device stuff
* Non-linear mappings
* Shared hugetlb memory that shares pagetables
* Shared hugetlb memory
* Hugetlb memory in general
* Shared normal memory that shares pagetables
* Shared normal memory (file backed; eg pagecache)
* Shared normal memory (anonymous/non-file-backed)
* Sysv/ipc shared memory
* Not shared normal memory

I don't think POSIX or anything else helps us here, so we can vote or
otherwise reason about which ones make sense and which don't. I hope the
outcome is reasonably consistent ;)

I know the desktop guys at least consider RSS useless as measure of "how
much memory does my desktop app take"; especially since they have many
shared libraries and they consider it unfair that each app pays the full
price in terms of RSS for those. So personally I'm not unhappy with a
definition that comes down to "all memory that's private to the app";
although it is a change from what 2.6.18 does.
--
if you want to mail me at work (you don't), use arjan (at) linux.intel.com
Peter Zijlstra
2006-10-10 13:14:47 UTC
Post by Arjan van de Ven
Post by Andrew Morton
Post by Arjan van de Ven
if it's ok to ignore RSS,
We'd prefer not to. But what's the alternative?
it's a good question; today (2.6.18) we have some de facto behavior of
RSS; 2.6.19-rc1-mm1 has a somewhat different one. Either can be entirely
valid; and we can obviously implement either. We can go even further and
remove more from RSS to help save memory and pagefaults (both help
desktop performance) by going the shared pagetable road.
Post by Andrew Morton
Post by Arjan van de Ven
So.. what does RSS actually mean? Can we ignore it somewhat for
shared-readonly mappings ?
We'd prefer to go the other way, and implement RLIMIT_RSS wouldn't we?
Well... that again depends on how we define RSS. implementing the rlimit
doesn't mean we can't NOT count certain things (like the hugetlb pages
in the patch above, or shared read only pagecache pages) to be part of
it. It's a fundamental "what does it mean" thing.
You can argue that RSS means "all memory that the application has in
its address space", you can argue "all such memory except a few cases",
you can argue "all memory that is private/exclusive to the
application"...
This is not a pointless piss-in-the-wind discussion; unless we define
rather specifically what it really means, the RLIMIT doesn't mean anything
either.
* VM_IO io mmaped device stuff
* Non-linear mappings
* Shared hugetlb memory that shares pagetables
* Shared hugetlb memory
* Hugetlb memory in general
* Shared normal memory that shares pagetables
* Shared normal memory (file backed; eg pagecache)
* Shared normal memory (anonymous/non-file-backed)
* Sysv/ipc shared memory
* Not shared normal memory
I don't think POSIX or anything else helps us here, so we can vote or
otherwise reason about which ones make sense and which don't. I hope the
outcome is reasonably consistent ;)
I know the desktop guys at least consider RSS useless as measure of "how
much memory does my desktop app take"; especially since they have many
shared libraries and they consider it unfair that each app pays the full
price in terms of RSS for those. So personally I'm not unhappy with a
definition that comes down to "all memory that's private to the app";
although it is a change from what 2.6.18 does.
So we have three cases where RSS matters besides presenting a number to
user-space;
- shared page tables
- containers
- rlimit

Preferably they will all share a common definition of what RSS is; since
containers must account shared pages somehow (not doing so would open up
a large hole to DoS the other containers) and the container case can be
argued to be an extension of the rlimit case we cannot just ignore them.

As for what to do with them, I've yet to figure that out. Some random bits
floating in my brain:
- VM_IO can be discarded; it's not actual memory

- hugetlb is memory too, so I'd not special case this (other than the
different unit of accounting)

- shared mapped pages could be accounted on vma level, since both
containers have access to the same file, there is already an imbalance,
so I'd not worry about the 1%-99% usage scenario here.

- regular, non mapped, pagecache pages however have no owner - what to
do. (fake vma - which would result in each container paying equally for
all these pages?)

Anyway, I'd rather not break RSS twice, once now because we don't quite
know what to do, and later when we do get an acceptable mm container and
have to include shared memory in one way or the other.
Arjan van de Ven
2006-10-10 16:13:10 UTC
Post by Peter Zijlstra
Post by Arjan van de Ven
* VM_IO io mmaped device stuff
* Non-linear mappings
* Shared hugetlb memory that shares pagetables
* Shared hugetlb memory
* Hugetlb memory in general
* Shared normal memory that shares pagetables
* Shared normal memory (file backed; eg pagecache)
* Shared normal memory (anonymous/non-file-backed)
* Sysv/ipc shared memory
* Not shared normal memory
So we have three cases where RSS matters besides presenting a number to
user-space;
- shared page tables
- containers
- rlimit
Preferably they will all share a common definition of what RSS is; since
containers must account shared pages somehow (not doing so would open up
a large hole to DoS the other containers) and the container case can be
argued to be an extension of the rlimit case we cannot just ignore them.
As for what to do with them, I've yet to figure that out. Some random bits
floating in my brain:
- VM_IO can be discarded; it's not actual memory
agreed (although I think we do count it today; this is half of what
makes X look so bloated, next to firefox ;)
Post by Peter Zijlstra
- hugetlb is memory too, so I'd not special case this (other than the
different unit of accounting)
agreed again; personally I don't think hugetlb memory should be special;
especially with all the libhugetlbfs work on making it real easy for
apps to use; the more it's used the more people would notice something
funky with it.
Post by Peter Zijlstra
- shared mapped pages could be accounted on vma level, since both
containers have access to the same file, there is already an imbalance,
so I'd not worry about the 1%-99% usage scenario here.
or one level below; if you count it in the actual PTE page then the
sharing case will "just work". It's a trick question; do you count it as
100% or do you count it as 100% / number of sharers?
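
To make the two options in that question concrete, here is a minimal sketch. It is not kernel code: the process names, page ids and both helper functions are made up for illustration, and each process's mappings are modelled as a plain set of page ids.

```python
from fractions import Fraction


def full_rss(mappings):
    """Charge every mapped page at 100% to each process that maps it."""
    return {proc: len(pages) for proc, pages in mappings.items()}


def proportional_rss(mappings):
    """Charge each page as 1 / (number of processes that map it)."""
    sharers = {}
    for pages in mappings.values():
        for page in pages:
            sharers[page] = sharers.get(page, 0) + 1
    return {
        proc: sum(Fraction(1, sharers[page]) for page in pages)
        for proc, pages in mappings.items()
    }


# Hypothetical layout: pages 1 and 2 are shared (say, a library),
# page 3 is private to "a", pages 4 and 5 are private to "b".
mappings = {"a": {1, 2, 3}, "b": {1, 2, 4, 5}}

print(full_rss(mappings))          # totals add up to 7 for 5 pages in use
print(proportional_rss(mappings))  # totals add up to exactly 5
```

With the 100% rule the per-process numbers overstate total memory use as soon as anything is shared; with the divided rule they sum to the real page count, but a process's number changes whenever another sharer appears or exits.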
Post by Peter Zijlstra
- regular, non mapped, pagecache pages however have no owner - what to
do. (fake vma - which would result in each container paying equally for
all these pages?)
if they're clean I wouldn't count them to anything actually
Post by Peter Zijlstra
Anyway, I'd rather not break RSS twice, once now because we don't quite
know what to do, and later when we do get an acceptable mm container and
have to include shared memory in one way or the other.
RSS of a container versus RSS of a process is an interesting question
for sure ;)
Eric W. Biederman
2006-10-10 23:54:16 UTC
Post by Arjan van de Ven
Post by Peter Zijlstra
Post by Arjan van de Ven
* VM_IO io mmaped device stuff
* Non-linear mappings
* Shared hugetlb memory that shares pagetables
* Shared hugetlb memory
* Hugetlb memory in general
* Shared normal memory that shares pagetables
* Shared normal memory (file backed; eg pagecache)
* Shared normal memory (anonymous/non-file-backed)
* Sysv/ipc shared memory
* Not shared normal memory
There is a concept related to RSS that is very interesting. The
minimum RSS that a process needs to keep from thrashing.

It can be shown that if your RSS rlimit is greater than your minimum
RSS your process will never thrash.
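
A toy simulation shows the cliff this describes. It is only a sketch under stated assumptions: strict LRU replacement, a process that loops over a fixed set of 8 pages, and a hypothetical helper `lru_faults`; a real kernel's page replacement is not strict LRU.

```python
from collections import OrderedDict


def lru_faults(accesses, limit):
    """Count page faults for an access sequence when at most
    `limit` pages may be resident, using LRU replacement."""
    resident = OrderedDict()
    faults = 0
    for page in accesses:
        if page in resident:
            resident.move_to_end(page)        # mark as most recently used
        else:
            faults += 1
            if len(resident) >= limit:
                resident.popitem(last=False)  # evict the least recently used page
            resident[page] = True
    return faults


loop = list(range(8)) * 100        # cycle over 8 pages, 100 times

print(lru_faults(loop, 8))   # limit covers the working set: only the 8 cold faults
print(lru_faults(loop, 7))   # one page short: every one of the 800 accesses faults
```

In this toy model the minimum RSS is 8 pages: at or above it the process faults only while warming up, and one page below it the process faults on every access.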

One thing that older paging algorithms would try to do when there
was memory pressure was to dynamically discover an application's
minimum RSS, and if they couldn't meet it, swap that process out,
because the program couldn't make progress anyway.

A per-process rather than per-application RSS fails to model
multi-process applications, but the concepts are sound.

So since VM_IO does not have any effect on paging it should not
be counted but if you were a stickler the page tables from the
VM_IO should be counted.

The other thing to note is that even if your RSS is just above your minimum
RSS, that will only result in your process having pages bounce in and
out of the page cache (not necessarily to disk). So while it will
increase minor faults, a restrictive RSS is not a problem when
you have excess resources.
Post by Arjan van de Ven
Post by Peter Zijlstra
- shared mapped pages could be accounted on vma level, since both
containers have access to the same file, there is already an imbalance,
so I'd not worry about the 1%-99% usage scenario here.
or one level below; if you count it in the actual PTE page then the
sharing case will "just work". It's a trick question; do you count it as
100% or do you count it as 100% / number of sharers?
I'm not certain I even want to follow the logic here. The answer is
clear.

For processes shared pages are not special.

For computing a container RSS shared pages need to be counted the
first time they are mapped by any process in a container, and
uncounted the last time they are unmapped by a process in a container.
With rmap we have the data structures necessary to do the accounting,
although it might be a bit of a pain.
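
The first-map/last-unmap rule can be sketched as a per-container mapping count. This is an illustration only: the class and method names are hypothetical, pages are plain ids, and nothing here reflects how rmap is actually structured in the kernel.

```python
class ContainerRss:
    """Toy per-container RSS: a page is charged once, for as long as
    at least one process in the container has it mapped."""

    def __init__(self):
        self.mapcount = {}   # page id -> number of mappings inside this container
        self.rss = 0         # pages currently charged to the container

    def map_page(self, page):
        count = self.mapcount.get(page, 0)
        if count == 0:
            self.rss += 1    # first mapping in the container: charge the page
        self.mapcount[page] = count + 1

    def unmap_page(self, page):
        count = self.mapcount.get(page, 0)
        if count == 0:
            raise ValueError("page is not mapped in this container")
        if count == 1:
            self.rss -= 1    # last mapping gone: uncharge the page
            del self.mapcount[page]
        else:
            self.mapcount[page] = count - 1


c = ContainerRss()
c.map_page(42)   # first process in the container maps page 42
c.map_page(42)   # second process maps the same page: no extra charge
c.map_page(7)
print(c.rss)     # two distinct pages are charged
```

The extra state is one counter per (container, page) pair, which is the bookkeeping cost the "bit of a pain" remark refers to.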
Post by Arjan van de Ven
Post by Peter Zijlstra
- regular, non mapped, pagecache pages however have no owner - what to
do. (fake vma - which would result in each container paying equally for
all these pages?)
if they're clean I wouldn't count them to anything actually
If the pages aren't mapped they aren't part of the resident set size.

At least not until we start worrying about per container or per
process splits of the page cache, and there are clearly good reasons
to avoid that.
Post by Arjan van de Ven
Post by Peter Zijlstra
Anyway, I'd rather not break RSS twice, once now because we don't quite
know what to do, and later when we do get an acceptable mm container and
have to include shared memory in one way or the other.
RSS of a container versus RSS of a process is an interesting question
for sure ;)
Agreed. Historically it was much more interesting because we didn't
have rmap, so telling if your set of processes had mapped the page
already was a challenge. At this point, unless the performance cost of
the accounting is too much, it should just be a simple counting problem.

The real challenge is doing a decent job of picking the appropriate
page to unmap when a new mapping by a process would exceed the rss
limit. That is the one piece of the puzzle we have never implemented
:(

You can declare success when you can push one container into heavy
swapping and the rest of the containers are still running fine.

Eric
Arjan van de Ven
2006-10-11 08:47:42 UTC
Post by Eric W. Biederman
For processes shared pages are not special.
depends on what question you want to answer with RSS.
If the question is "workload working set size" then you are right. If
the question is "how much ram does my application cause to be used" the
answer is FAR less clear....

You seem to have an implicit definition of what RSS should mean; but
it's implicit. Mind making an explicit definition of what RSS should be
in your opinion? I think that's the biggest problem we have right now;
several people have different ideas about what it should/could be, and
as such we're not talking about the same thing. Let's first agree/specify
what it SHOULD mean, and then we can figure out what gets counted for
that ;)
Eric W. Biederman
2006-10-11 12:07:28 UTC
Post by Arjan van de Ven
Post by Eric W. Biederman
For processes shared pages are not special.
Actually the above is not quite true: you can map a shared page twice
into the same process, but in practice it rarely happens.
Post by Arjan van de Ven
depends on what question you want to answer with RSS.
If the question is "workload working set size" then you are right. If
the question is "how much ram does my application cause to be used" the
answer is FAR less clear....
There are two basic concerns. How do you keep an application from
going crazy and trashing the rest of your system? That is a question of
what number you need to implement a resource limit.

The other question is how do I get good information so I can
effectively understand what kind of resources a given
application is using and hopefully predict what kind of
resources that application will use in the future.

The last time I tried to answer the question "how much ram does my
application cause to be used" I had to slowly start up additional
copies and watch how much free memory decreased.

Having a couple of additional counts in addition to RSS would probably
be the most help in understanding resource usage. Counting the number
of private dirty resident pages would be interesting. As would
counting the number of pages that are resident in the process but not
resident in any other process.

It might also help to have a per-page report on which file it backs
and how many users it has. Unfortunately that is totally overwhelming
detail and the act of reporting it would quite likely change the
result as it would take so much memory to store the result.
Post by Arjan van de Ven
You seem to have an implicit definition of what RSS should mean; but
it's implicit. Mind making an explicit definition of what RSS should be
in your opinion? I think that's the biggest problem we have right now;
several people have different ideas about what it should/could be, and
as such we're not talking about the same thing. Let's first agree/specify
what it SHOULD mean, and then we can figure out what gets counted for
that ;)
Well, I tried to define it in terms of what you can use it for.

I would define the resident set size as the total number of bytes
of physical RAM that a process (or set of processes) is using,
irrespective of the rest of the system.

By physical RAM I mean that if a single page (if shared) is used twice
by a single process it will be counted only once. This definition
works well for shared and private pages.

COW pages are probably the most subtle. They are both shared and not
shared. Since that sharing is application visible and at application
discretion I would count COW pages as shared until that sharing
is broken, and then I would count them as private pages.

The principle is that you don't find the ``owner'' of a page and
charge the page to the ``owner''. Instead you find the users
of a page and charge all of them for the page exactly once.

By and large most of that usage comes from pages in the page tables
so they are the most interesting items to count.

Things like file descriptors, inodes, page tables and other kernel
memory are interesting but are generally overshadowed by the page
table users, and generally kernel data structures don't have reverse
maps which makes it difficult to charge all of the users.

So I think the counting should be primarily about what is mapped into
the page tables. But other things can be added as is appropriate or
easy.

The practical effect should be that an application that needs more
pages than its specified RSS to avoid thrashing should thrash but
it shouldn't take the rest of the system with it.


The biggest instance of system memory that an application does not
seem to have true control over is the page cache. Some kind of limit
that prevents one application from destroying everything another
application is doing seems interesting. But I expect the solution there
is I/O limits and not memory limits.

Eric
Arjan van de Ven
2006-10-11 13:55:13 UTC
Post by Eric W. Biederman
Post by Arjan van de Ven
Post by Eric W. Biederman
For processes shared pages are not special.
Actually the above is not quite true: you can map a shared page twice
into the same process, but in practice it rarely happens.
yeah I'm entirely fine with ignoring that case (or making the person who
does it pay for it :)
Post by Eric W. Biederman
Post by Arjan van de Ven
depends on what question you want to answer with RSS.
If the question is "workload working set size" then you are right. If
the question is "how much ram does my application cause to be used" the
answer is FAR less clear....
There are two basic concerns. How do you keep an application from
going crazy and trashing the rest of your system? That is a question of
what number you need to implement a resource limit.
yet at the same time if 2 apps mmap a shared file, and app 1 keeps it in
pagecache, it doesn't cause app 2 to thrash; or rather, it's not as if
the system would stop thrashing if app 2 did NOT have the page from that file.
Post by Eric W. Biederman
Post by Arjan van de Ven
You seem to have an implicit definition of what RSS should mean; but
it's implicit. Mind making an explicit definition of what RSS should be
in your opinion? I think that's the biggest problem we have right now;
several people have different ideas about what it should/could be, and
as such we're not talking about the same thing. Let's first agree/specify
what it SHOULD mean, and then we can figure out what gets counted for
that ;)
Well, I tried to define it in terms of what you can use it for.
I would define the resident set size as the total number of bytes
of physical RAM that a process (or set of processes) is using,
irrespective of the rest of the system.
So I think the counting should be primarily about what is mapped into
the page tables. But other things can be added as is appropriate or
easy.
The practical effect should be that an application that needs more
pages than its specified RSS to avoid thrashing should thrash but
it shouldn't take the rest of the system with it.
so by your definition, hugepages are part of RSS.

Ken: what is your definition of RSS ?
Chen, Kenneth W
2006-10-11 17:15:39 UTC
Arjan van de Ven wrote on Wednesday, October 11, 2006 6:55 AM
Post by Arjan van de Ven
Post by Eric W. Biederman
Well, I tried to define it in terms of what you can use it for.
I would define the resident set size as the total number of bytes
of physical RAM that a process (or set of processes) is using,
irrespective of the rest of the system.
So I think the counting should be primarily about what is mapped into
the page tables. But other things can be added as is appropriate or
easy.
The practical effect should be that an application that needs more
pages than its specified RSS to avoid thrashing should thrash but
it shouldn't take the rest of the system with it.
so by your definition, hugepages are part of RSS.
Ken: what is your definition of RSS ?
I'm more inclined to define RSS as "how much ram does my application
cause to be used". To monitor a process's working set size, we already
have /proc/<pid>/smaps. Whether we can use working set size in an
intelligent way in mm is an interesting question. Though, so far such
accounting is not utilized at all.
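
As a rough illustration of the smaps-based view mentioned above, the following userspace sketch totals the "Rss:" lines of /proc/self/smaps. The helper names (sum_field_kb_in_text, self_rss_kb) are made up for this example, and it assumes each field line has the form "Rss: <n> kB"; it is not how the kernel maintains its own RSS counters.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Sum every "<field> <n> kB" line found in smaps-formatted text.
 * `field` includes the trailing colon, e.g. "Rss:". */
long sum_field_kb_in_text(const char *text, const char *field)
{
    size_t flen = strlen(field);
    long total = 0;
    const char *line = text;

    while (line && *line) {
        /* Only lines that start with the field name are counted. */
        if (strncmp(line, field, flen) == 0)
            total += strtol(line + flen, NULL, 10);
        line = strchr(line, '\n');
        if (line)
            line++;             /* step past the newline */
    }
    return total;
}

/* Total the Rss lines of the calling process, in kB.
 * Returns -1 if /proc/self/smaps cannot be opened. */
long self_rss_kb(void)
{
    FILE *f = fopen("/proc/self/smaps", "r");
    char line[256];
    long total = 0;

    if (!f)
        return -1;
    while (fgets(line, sizeof line, f))
        total += sum_field_kb_in_text(line, "Rss:");
    fclose(f);
    return total;
}
```

Note that this walks every mapping on each call, which is the O(n) cost that the per-process counters avoid.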

- Ken
Benjamin LaHaise
2006-10-11 22:36:34 UTC
Permalink
Post by Chen, Kenneth W
I'm more inclined to define RSS as "how much ram does my application
cause to be used". To monitor process's working set size, We already
have /proc/<pid>/smaps. Whether we can use working set size in an
intelligent way in mm is an interesting question. Though, so far such
accounting is not utilized at all.
If that is the case, it would make sense to account such things as page
tables and other kernel allocations against the RSS, which would be useful.
That said, it's possible to keep semantics fairly close to those currently
implemented by tracking RSS differently for shared vs private areas --
those vmas which are shared could be placed on a list and then summed when
RSS is read. That said, I'm not sure it is a good idea, as the cost of
obtaining RSS for tools like top is exactly why we have the current
counters maintained to provide O(1) semantics.

All of the old semantics are covered by smaps, though, so I'd agree with
any changes to make RSS reflect allocations incurred by this process.

-ben
--
"Time is of no importance, Mr. President, only life is important."
Don't Email: <***@kvack.org>.
Miguel Ojeda
2006-10-10 07:31:23 UTC
Permalink
Post by Andrew Morton
+# drivers-add-lcd-support.patch: Pavel says use fbcon
+drivers-add-lcd-support.patch
+drivers-add-lcd-support-update.patch
Has the # a special meaning?

I'm going to work on offering the fbcon feature as Pavel requested. We
suggested 2 ways.

Pavel's idea: Change the driver so the cfag12864b module will be just
a framebuffer device, removing access through /dev/cfag12864b.

My idea: Code a new module called "fbcfag12864b", which will depend on
cfag12864b and will be the framebuffer device. This way we have both
devices, and they don't affect each other as they are different
things. So the ks0108 and cfag12864b can stay without any changes.
Also, if we finally decide we don't want the raw cfag12864b module, it
is easy to remove it from the cfag12864b and the fbcfag12864b will
continue working.

Is there anyone who can decide which idea is better? If not, I will
code it my way. Also, if Pavel's idea is the one chosen, it
will be easier to put the fbcfag12864b code into the cfag12864b rather
than the opposite.
Andrew Morton
2006-10-10 08:10:52 UTC
Permalink
On Tue, 10 Oct 2006 07:31:23 +0000
Post by Miguel Ojeda
Post by Andrew Morton
+# drivers-add-lcd-support.patch: Pavel says use fbcon
+drivers-add-lcd-support.patch
+drivers-add-lcd-support-update.patch
Has the # a special meaning?
It's a comment separator ;)
Post by Miguel Ojeda
I'm going to work on offering the fbcon feature as Pavel requested.
Thanks. It does sound like making the thing an fbdev is the right way to go.
Post by Miguel Ojeda
suggested 2 ways.
Pavel's idea: Change the driver so the cfag12864b module will be just
a framebuffer device, removing access through /dev/cfag12864b.
My idea: Code a new module called "fbcfag12864b", which will depend on
cfag12864b and will be the framebuffer device. This way we have both
devices, and they don't affect each other as they are different
things. So the ks0108 and cfag12864b can stay without any changes.
Also, if we finally decide we don't want the raw cfag12864b module, it
is easy to remove it from the cfag12864b and the fbcfag12864b will
continue working.
Is there anyone who can decide which idea is better? If not, I will
code it my way. Also, if Pavel's idea is the one chosen, it
will be easier to put the fbcfag12864b code into the cfag12864b rather
than the opposite.
I'd have thought that once the device is accessible as an fbdev, there's so
much other software and kernel infrastructure to support that, there's
little point in offering an alternative way of presenting the device to
userspace.
Miguel Ojeda
2006-10-10 09:57:11 UTC
Permalink
Post by Andrew Morton
On Tue, 10 Oct 2006 07:31:23 +0000
Post by Miguel Ojeda
Post by Andrew Morton
+# drivers-add-lcd-support.patch: Pavel says use fbcon
+drivers-add-lcd-support.patch
+drivers-add-lcd-support-update.patch
Has the # a special meaning?
It's a comment separator ;)
Ouch. Right.
Post by Andrew Morton
Post by Miguel Ojeda
I'm going to work on offering the fbcon feature as Pavel requested.
Thanks. It does sound like making the thing an fbdev is the right way to go.
Yep, that way will let us run an Xgl server through some ASCII video
lib and display it on the 128x64 LCD fb device ;)
Post by Andrew Morton
Post by Miguel Ojeda
suggested 2 ways.
Pavel's idea: Change the driver so the cfag12864b module will be just
a framebuffer device, removing access through /dev/cfag12864b.
My idea: Code a new module called "fbcfag12864b", which will depend on
cfag12864b and will be the framebuffer device. This way we have both
devices, and they don't affect each other as they are different
things. So the ks0108 and cfag12864b can stay without any changes.
Also, if we finally decide we don't want the raw cfag12864b module, it
is easy to remove it from the cfag12864b and the fbcfag12864b will
continue working.
Is there anyone who can decide which idea is better? If not, I will
code it my way. Also, if Pavel's idea is the one chosen, it
will be easier to put the fbcfag12864b code into the cfag12864b rather
than the opposite.
I'd have thought that once the device is accessible as an fbdev, there's so
much other software and kernel infrastructure to support that, there's
little point in offering an alternative way of presenting the device to
userspace.
All right, I will code it as the fbcfag12864b module, and once it is
working fine, I will remove the raw device of the cfag12864b and if
the code of fbcfag12864b turns out to be really small/trivial, I will put it
in the cfag12864b module. Then I will send you the "update patch 2".

Give me a couple of days or so :)

Thank you,
Miguel Ojeda
Jeremy Fitzhardinge
2006-10-10 18:25:21 UTC
Permalink
Post by Miguel Ojeda
All right, I will code it as the fbcfag12864b module, and once it is
working fine, I will remove the raw device of the cfag12864b and if
the code of fbcfag12864b turns out to be really small/trivial, I will put it
in the cfag12864b module. Then I will send you the "update patch 2".
Give me a couple of days or so :)
Just as an aside, would it be possible to give this module a name which
doesn't look like a changeset ID? It's only the 'g' which gives it away...

J
Theodore Tso
2006-10-10 12:19:50 UTC
Permalink
Post by Andrew Morton
- Grab updated e2fsprogs from
ftp://ftp.kernel.org/pub/linux/kernel/people/tytso/e2fsprogs-interim/
- It's still mke2fs -j /dev/hda1
- mount /dev/hda1 /wherever -t ext4dev
- To enable extents,
mount /dev/hda1 /wherever -t ext4dev -o extents
Looks like you didn't take the updated patch from Shaggy which
requires that you use tune2fs -O extents first? (This requires the
e2fsprogs-interim patches.)

The plan is that mount -o extents is not going to be the long-term way
that extents will be enabled. I can imagine a -o noextents option,
which might be used with remount to do an on-line rollback from
extents to non-extents, but normally you shouldn't need to use a mount
option to enable features that are filesystem format-related. Those
should be implied by the appropriate flags in the superblock.

Mount -o nobh is a different story, since that's just an implementation
detail --- although for ext4, maybe we should just make nobh a
default, since that way more people will test it and hopefully,
eventually nobh will be the only way of doing things, right?
Post by Andrew Morton
Making the journal larger than the mke2fs default often helps
performance with metadata-intensive workloads.
The default was increased significantly in e2fsprogs 1.40; if someone
who has their favorite metadata-intensive benchmark could test and see
if we should be using even larger defaults for certain "mke2fs -T
<workload-type>" configurations, I'd really appreciate it.

- Ted
Arjan van de Ven
2006-10-10 12:26:03 UTC
Permalink
Post by Theodore Tso
Mount -o nobh is a different story, since that's just an implementation
detail --- although for ext4, maybe we should just make nobh a
default, since that way more people will test it and hopefully,
eventually nobh will be the only way of doing things, right?
imo it should be that even for ext3!
Andrew Morton
2006-10-10 16:21:34 UTC
Permalink
On Tue, 10 Oct 2006 08:19:50 -0400
Post by Theodore Tso
Post by Andrew Morton
- Grab updated e2fsprogs from
ftp://ftp.kernel.org/pub/linux/kernel/people/tytso/e2fsprogs-interim/
- It's still mke2fs -j /dev/hda1
- mount /dev/hda1 /wherever -t ext4dev
- To enable extents,
mount /dev/hda1 /wherever -t ext4dev -o extents
Looks like you didn't take the updated patch from Shaggy which
requires that you use tune2fs -O extents first?
Nope. That would have made extents inaccessible with the e2fsprogs I was
using and I didn't have time to test e2fsprogs-interim.
Post by Theodore Tso
(This requires the
e2fsprogs-interim patches.)
OK.
Post by Theodore Tso
The plan is that mount -o extents is not going to be the long-term way
that extents will be enabled. I can imagine a -o noextents option,
which might be used with remount to do an on-line rollback from
extents to non-extents, but normally you shouldn't need to use a mount
option to enable features that are filesystem format-related. Those
should be implied by the appropriate flags in the superblock.
Mount -o nobh is a different story, since that's just an implementation
detail --- although for ext4, maybe we should just make nobh a
default, since that way more people will test it and hopefully,
eventually nobh will be the only way of doing things, right?
nobh might be inefficient with large PAGE_SIZE and small files (or just
small writes).
Post by Theodore Tso
Post by Andrew Morton
Making the journal larger than the mke2fs default often helps
performance with metadata-intensive workloads.
The default was increased significantly in e2fsprogs 1.40; if someone
who has their favorite metadata-intensive benchmark could test and see
if we should be using even larger defaults for certain "mke2fs -T
<workload-type>" configurations, I'd really appreciate it.
- Ted
Michal Piotrowski
2006-10-10 13:10:10 UTC
Permalink
Hi,
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19-rc1/2.6.19-rc1-mm1/
Kernel 2.6.19-rc1-mm1 + Neil's avoid_lockdep_warning_in_md.patch
(http://www.ussg.iu.edu/hypermail/linux/kernel/0610.1/0642.html)

(I'll try to reproduce this without Neil's patch).

echo shutdown > /sys/power/disk; echo disk > /sys/power/state

=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.19-rc1-mm1 #4
-------------------------------------------------------
bash/2404 is trying to acquire lock:
((cpu_chain).rwsem){..--}, at: [<c012e6a0>]
blocking_notifier_call_chain+0x11/0x2d

but task is already holding lock:
(workqueue_mutex){--..}, at: [<c0313dea>] mutex_lock+0x1c/0x1f

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (workqueue_mutex){--..}:
[<c013a4d0>] add_lock_to_list+0x5c/0x7a
[<c013c5f5>] __lock_acquire+0x9f3/0xaef
[<c013ca5b>] lock_acquire+0x71/0x91
[<c0313baf>] __mutex_lock_slowpath+0xd2/0x2f1
[<c0313dea>] mutex_lock+0x1c/0x1f
[<c0131767>] workqueue_cpu_callback+0x109/0x1ff
[<c012e30f>] notifier_call_chain+0x20/0x31
[<c012e6ac>] blocking_notifier_call_chain+0x1d/0x2d
[<c014126b>] _cpu_down+0x48/0x1ff
[<c01415fc>] disable_nonboot_cpus+0x9b/0x12f
[<c0146bd3>] prepare_processes+0xf/0x73
[<c0146e78>] pm_suspend_disk+0xa/0x11c
[<c01460d4>] enter_state+0x5a/0x185
[<c0146285>] state_store+0x86/0x9c
[<c01ad6dc>] subsys_attr_store+0x20/0x25
[<c01ad7df>] sysfs_write_file+0xaa/0xd3
[<c0177189>] vfs_write+0xcd/0x179
[<c0177834>] sys_write+0x3b/0x71
[<c0103241>] sysenter_past_esp+0x56/0x8d
[<ffffffff>] 0xffffffff

-> #0 ((cpu_chain).rwsem){..--}:
[<c013bbce>] print_circular_bug_tail+0x30/0x64
[<c013c52c>] __lock_acquire+0x92a/0xaef
[<c013ca5b>] lock_acquire+0x71/0x91
[<c0138202>] down_read+0x28/0x3c
[<c012e6a0>] blocking_notifier_call_chain+0x11/0x2d
[<c014138b>] _cpu_down+0x168/0x1ff
[<c01415fc>] disable_nonboot_cpus+0x9b/0x12f
[<c0146bd3>] prepare_processes+0xf/0x73
[<c0146e78>] pm_suspend_disk+0xa/0x11c
[<c01460d4>] enter_state+0x5a/0x185
[<c0146285>] state_store+0x86/0x9c
[<c01ad6dc>] subsys_attr_store+0x20/0x25
[<c01ad7df>] sysfs_write_file+0xaa/0xd3
[<c0177189>] vfs_write+0xcd/0x179
[<c0177834>] sys_write+0x3b/0x71
[<c0103241>] sysenter_past_esp+0x56/0x8d
[<ffffffff>] 0xffffffff

other info that might help us debug this:

2 locks held by bash/2404:
#0: (cpu_add_remove_lock){--..}, at: [<c0313dea>] mutex_lock+0x1c/0x1f
#1: (workqueue_mutex){--..}, at: [<c0313dea>] mutex_lock+0x1c/0x1f

stack backtrace:
[<c01042e6>] dump_trace+0x64/0x1cd
[<c0104461>] show_trace_log_lvl+0x12/0x25
[<c0104a08>] show_trace+0xd/0x10
[<c0104a4f>] dump_stack+0x19/0x1b
[<c013bbf7>] print_circular_bug_tail+0x59/0x64
[<c013c52c>] __lock_acquire+0x92a/0xaef
[<c013ca5b>] lock_acquire+0x71/0x91
[<c0138202>] down_read+0x28/0x3c
[<c012e6a0>] blocking_notifier_call_chain+0x11/0x2d
[<c014138b>] _cpu_down+0x168/0x1ff
[<c01415fc>] disable_nonboot_cpus+0x9b/0x12f
[<c0146bd3>] prepare_processes+0xf/0x73
[<c0146e78>] pm_suspend_disk+0xa/0x11c
[<c01460d4>] enter_state+0x5a/0x185
[<c0146285>] state_store+0x86/0x9c
[<c01ad6dc>] subsys_attr_store+0x20/0x25
[<c01ad7df>] sysfs_write_file+0xaa/0xd3
[<c0177189>] vfs_write+0xcd/0x179
[<c0177834>] sys_write+0x3b/0x71
[<c0103241>] sysenter_past_esp+0x56/0x8d
DWARF2 unwinder stuck at sysenter_past_esp+0x56/0x8d
Leftover inexact backtrace:
=======================

config & dmesg http://www.stardust.webpages.pl/files/tbf/euridica/2.6.19-rc1-mm1/

Regards,
Michal
--
Michal K. K. Piotrowski
LTG - Linux Testers Group
(http://www.stardust.webpages.pl/ltg/)
Michal Piotrowski
2006-10-10 14:04:22 UTC
Permalink
Post by Michal Piotrowski
Hi,
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19-rc1/2.6.19-rc1-mm1/
Kernel 2.6.19-rc1-mm1 + Neil's avoid_lockdep_warning_in_md.patch
(http://www.ussg.iu.edu/hypermail/linux/kernel/0610.1/0642.html)
(I'll try to reproduce this without Neil's patch).
I can't reproduce this without Neil's patch.
Post by Michal Piotrowski
echo shutdown > /sys/power/disk; echo disk > /sys/power/state
=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.19-rc1-mm1 #4
-------------------------------------------------------
((cpu_chain).rwsem){..--}, at: [<c012e6a0>]
blocking_notifier_call_chain+0x11/0x2d
(workqueue_mutex){--..}, at: [<c0313dea>] mutex_lock+0x1c/0x1f
which lock already depends on the new lock.
[<c013a4d0>] add_lock_to_list+0x5c/0x7a
[<c013c5f5>] __lock_acquire+0x9f3/0xaef
[<c013ca5b>] lock_acquire+0x71/0x91
[<c0313baf>] __mutex_lock_slowpath+0xd2/0x2f1
[<c0313dea>] mutex_lock+0x1c/0x1f
[<c0131767>] workqueue_cpu_callback+0x109/0x1ff
[<c012e30f>] notifier_call_chain+0x20/0x31
[<c012e6ac>] blocking_notifier_call_chain+0x1d/0x2d
[<c014126b>] _cpu_down+0x48/0x1ff
[<c01415fc>] disable_nonboot_cpus+0x9b/0x12f
[<c0146bd3>] prepare_processes+0xf/0x73
[<c0146e78>] pm_suspend_disk+0xa/0x11c
[<c01460d4>] enter_state+0x5a/0x185
[<c0146285>] state_store+0x86/0x9c
[<c01ad6dc>] subsys_attr_store+0x20/0x25
[<c01ad7df>] sysfs_write_file+0xaa/0xd3
[<c0177189>] vfs_write+0xcd/0x179
[<c0177834>] sys_write+0x3b/0x71
[<c0103241>] sysenter_past_esp+0x56/0x8d
[<ffffffff>] 0xffffffff
[<c013bbce>] print_circular_bug_tail+0x30/0x64
[<c013c52c>] __lock_acquire+0x92a/0xaef
[<c013ca5b>] lock_acquire+0x71/0x91
[<c0138202>] down_read+0x28/0x3c
[<c012e6a0>] blocking_notifier_call_chain+0x11/0x2d
[<c014138b>] _cpu_down+0x168/0x1ff
[<c01415fc>] disable_nonboot_cpus+0x9b/0x12f
[<c0146bd3>] prepare_processes+0xf/0x73
[<c0146e78>] pm_suspend_disk+0xa/0x11c
[<c01460d4>] enter_state+0x5a/0x185
[<c0146285>] state_store+0x86/0x9c
[<c01ad6dc>] subsys_attr_store+0x20/0x25
[<c01ad7df>] sysfs_write_file+0xaa/0xd3
[<c0177189>] vfs_write+0xcd/0x179
[<c0177834>] sys_write+0x3b/0x71
[<c0103241>] sysenter_past_esp+0x56/0x8d
[<ffffffff>] 0xffffffff
#0: (cpu_add_remove_lock){--..}, at: [<c0313dea>] mutex_lock+0x1c/0x1f
#1: (workqueue_mutex){--..}, at: [<c0313dea>] mutex_lock+0x1c/0x1f
[<c01042e6>] dump_trace+0x64/0x1cd
[<c0104461>] show_trace_log_lvl+0x12/0x25
[<c0104a08>] show_trace+0xd/0x10
[<c0104a4f>] dump_stack+0x19/0x1b
[<c013bbf7>] print_circular_bug_tail+0x59/0x64
[<c013c52c>] __lock_acquire+0x92a/0xaef
[<c013ca5b>] lock_acquire+0x71/0x91
[<c0138202>] down_read+0x28/0x3c
[<c012e6a0>] blocking_notifier_call_chain+0x11/0x2d
[<c014138b>] _cpu_down+0x168/0x1ff
[<c01415fc>] disable_nonboot_cpus+0x9b/0x12f
[<c0146bd3>] prepare_processes+0xf/0x73
[<c0146e78>] pm_suspend_disk+0xa/0x11c
[<c01460d4>] enter_state+0x5a/0x185
[<c0146285>] state_store+0x86/0x9c
[<c01ad6dc>] subsys_attr_store+0x20/0x25
[<c01ad7df>] sysfs_write_file+0xaa/0xd3
[<c0177189>] vfs_write+0xcd/0x179
[<c0177834>] sys_write+0x3b/0x71
[<c0103241>] sysenter_past_esp+0x56/0x8d
DWARF2 unwinder stuck at sysenter_past_esp+0x56/0x8d
=======================
config & dmesg http://www.stardust.webpages.pl/files/tbf/euridica/2.6.19-rc1-mm1/
Regards,
Michal
--
Michal K. K. Piotrowski
LTG - Linux Testers Group
(http://www.stardust.webpages.pl/ltg/)
Neil Brown
2006-10-11 05:35:38 UTC
Permalink
Post by Michal Piotrowski
Post by Michal Piotrowski
Hi,
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19-rc1/2.6.19-rc1-mm1/
Kernel 2.6.19-rc1-mm1 + Neil's avoid_lockdep_warning_in_md.patch
(http://www.ussg.iu.edu/hypermail/linux/kernel/0610.1/0642.html)
(I'll try to reproduce this without Neil's patch).
I can't reproduce this without Neil's patch.
Despite this circumstantial evidence, I don't see how my patch could
possibly have an effect here....

Looking at the code, starting at _cpu_down in the CONFIG_HOTPLUG_CPU
case, the notifier chain 'cpu_chain' contains
workqueue_cpu_callback which does 'mutex_lock(&workqueue_mutex)' in
the "DOWN_PREPARE" case and mutex_unlock(&workqueue_mutex) in the
DOWN_FAILED and DEAD cases.

blocking_notifier_call_chain is
down_read(&nh->rwsem);
ret = notifier_call_chain(&nh->head, val, v);
up_read(&nh->rwsem);

and so holds ->rwsem while calling the callback.
So the locking sequence ends up as:

down_read(&cpu_chain.rwsem);
mutex_lock(&workqueue_mutex);
up_read(&cpu_chain.rwsem);

down_read(&cpu_chain.rwsem);
mutex_unlock(&workqueue_mutex);
up_read(&cpu_chain.rwsem);

and lockdep doesn't seem to like this. It sees workqueue_mutex
claimed while cpu_chain.rwsem is held, and then it sees
cpu_chain.rwsem claimed while workqueue_mutex is held, which looks a
bit like a classic ABBA deadlock.
Of course because it is a 'down_read' rather than a 'down', it isn't
really a deadlock.
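
To make the complaint concrete, here is a minimal userspace sketch of the kind of lock-order bookkeeping that produces such a report: each "acquired while held" pair is recorded as an edge, and a new edge that closes a loop is flagged. The names (record_dependency, the two enum indices) are invented for the example, and this is only a toy model of the idea, not lockdep's actual implementation.

```c
#include <stdbool.h>

#define MAX_LOCKS 8

/* Hypothetical lock indices, used only for this illustration. */
enum { CPU_CHAIN_RWSEM, WORKQUEUE_MUTEX };

/* dep[a][b] is true once lock b has been acquired while a was held. */
static bool dep[MAX_LOCKS][MAX_LOCKS];

/* Depth-first search: can `to` be reached from `from` along recorded edges? */
static bool reachable(int from, int to, bool *seen)
{
    if (from == to)
        return true;
    seen[from] = true;
    for (int next = 0; next < MAX_LOCKS; next++)
        if (dep[from][next] && !seen[next] && reachable(next, to, seen))
            return true;
    return false;
}

/* Record "acquired while held". Returns true if the new edge closes a
 * cycle, i.e. `held` was already reachable from `acquired`. */
bool record_dependency(int held, int acquired)
{
    bool seen[MAX_LOCKS] = { false };
    bool cycle = reachable(acquired, held, seen);

    dep[held][acquired] = true;
    return cycle;
}
```

Recording (CPU_CHAIN_RWSEM, WORKQUEUE_MUTEX) for the DOWN_PREPARE call and then (WORKQUEUE_MUTEX, CPU_CHAIN_RWSEM) for the later call is enough to make the second call report a cycle; a model this simple does not distinguish read locks from write locks.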

I don't know how to tell lockdep to do the right thing, but I'll leave
that up to Ingo et al.

Why it didn't trigger without my patch I cannot imagine. Are you sure
the config was identical (you didn't remove CONFIG_HOTPLUG_CPU or
anything, did you?).

NeilBrown
Post by Michal Piotrowski
Post by Michal Piotrowski
echo shutdown > /sys/power/disk; echo disk > /sys/power/state
=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.19-rc1-mm1 #4
-------------------------------------------------------
Michal Piotrowski
2006-10-11 10:48:20 UTC
Permalink
Post by Neil Brown
Post by Michal Piotrowski
Post by Michal Piotrowski
Hi,
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19-rc1/2.6.19-rc1-mm1/
Kernel 2.6.19-rc1-mm1 + Neil's avoid_lockdep_warning_in_md.patch
(http://www.ussg.iu.edu/hypermail/linux/kernel/0610.1/0642.html)
(I'll try to reproduce this without Neil's patch).
I can't reproduce this without Neil's patch.
Despite this circumstantial evidence, I don't see how my patch could
possibly have an effect here....
Looking at the code, starting at _cpu_down in the CONFIG_HOTPLUG_CPU
case, the notifier chain 'cpu_chain' contains
workqueue_cpu_callback which does 'mutex_lock(&workqueue_mutex)' in
the "DOWN_PREPARE" case and mutex_unlock(&workqueue_mutex) in the
DOWN_FAILED and DEAD cases.
blocking_notifier_call_chain is
down_read(&nh->rwsem);
ret = notifier_call_chain(&nh->head, val, v);
up_read(&nh->rwsem);
and so holds ->rwsem while calling the callback.
down_read(&cpu_chain.rwsem);
mutex_lock(&workqueue_mutex);
up_read(&cpu_chain.rwsem);
down_read(&cpu_chain.rwsem);
mutex_unlock(&workqueue_mutex);
up_read(&cpu_chain.rwsem);
and lockdep doesn't seem to like this. It sees workqueue_mutex
claimed while cpu_chain.rwsem is held, and then it sees
cpu_chain.rwsem claimed while workqueue_mutex is held, which looks a
bit like a classic ABBA deadlock.
Of course because it is a 'down_read' rather than a 'down', it isn't
really a deadlock.
I don't know how to tell lockdep to do the right thing, but I'll leave
that up to Ingo et al.
Why it didn't trigger without my patch I cannot imagine. Are you sure
the config was identical (you didn't remove CONFIG_HOTPLUG_CPU or
anything, did you?).
No, I didn't remove CONFIG_HOTPLUG_CPU or anything else.

I didn't do enough testing - only a few hibernations.
Post by Neil Brown
NeilBrown
Regards,
Michal
--
Michal K. K. Piotrowski
LTG - Linux Testers Group
(http://www.stardust.webpages.pl/ltg/)
Arjan van de Ven
2006-10-11 11:23:06 UTC
Permalink
Post by Neil Brown
blocking_notifier_call_chain is
down_read(&nh->rwsem);
ret = notifier_call_chain(&nh->head, val, v);
up_read(&nh->rwsem);
and so holds ->rwsem while calling the callback.
down_read(&cpu_chain.rwsem);
mutex_lock(&workqueue_mutex);
up_read(&cpu_chain.rwsem);
down_read(&cpu_chain.rwsem);
mutex_unlock(&workqueue_mutex);
up_read(&cpu_chain.rwsem);
and lockdep doesn't seem to like this. It sees workqueue_mutex
claimed while cpu_chain.rwsem is held, and then it sees
cpu_chain.rwsem claimed while workqueue_mutex is held, which looks a
bit like a classic ABBA deadlock.
Of course because it is a 'down_read' rather than a 'down', it isn't
really a deadlock.
ok can you explain to me why "down_read" doesn't make this a deadlock
while "down" would make it a deadlock? I have trouble following your
reasoning.....

(remember that rwsems are strictly fair)
Neil Brown
2006-10-11 13:08:21 UTC
Permalink
Post by Arjan van de Ven
Post by Neil Brown
blocking_notifier_call_chain is
down_read(&nh->rwsem);
ret = notifier_call_chain(&nh->head, val, v);
up_read(&nh->rwsem);
and so holds ->rwsem while calling the callback.
down_read(&cpu_chain.rwsem);
mutex_lock(&workqueue_mutex);
up_read(&cpu_chain.rwsem);
down_read(&cpu_chain.rwsem);
mutex_unlock(&workqueue_mutex);
up_read(&cpu_chain.rwsem);
and lockdep doesn't seem to like this. It sees workqueue_mutex
claimed while cpu_chain.rwsem is held, and then it sees
cpu_chain.rwsem claimed while workqueue_mutex is held, which looks a
bit like a classic ABBA deadlock.
Of course because it is a 'down_read' rather than a 'down', it isn't
really a deadlock.
ok can you explain to me why "down_read" doesn't make this a deadlock
while "down" would make it a deadlock? I have trouble following your
reasoning.....
(remember that rwsems are strictly fair)
I see your point.

While thread A holds just workqueue_mutex,
thread B takes cpu_chain.rwsem for read then tries to take
workqueue_mutex and blocks.
Now thread C tries to get a write lock on cpu_chain.rwsem and blocks
as well.
Finally thread A moves on to try to get a read lock on cpu_chain.rwsem
and this blocks because thread C is waiting for a write lock.

So A waits on B and C, C waits on B, B waits on A.
Deadlock.
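
The step that makes this a deadlock is the fairness rule: a new reader must queue behind a writer that is already waiting. A toy, non-blocking model of that rule is sketched below; struct fair_rwsem and the try_* helpers are invented for the illustration and are not the kernel's rwsem code.

```c
#include <stdbool.h>

/* Toy model of a strictly fair reader/writer semaphore. */
struct fair_rwsem {
    int active_readers;   /* readers currently holding the lock */
    bool writer_active;   /* a writer currently holds the lock */
    int queued_writers;   /* writers waiting their turn */
};

/* Returns true if the read lock is granted at once, false if the caller
 * would have to sleep. A queued writer blocks new readers. */
bool try_down_read_fair(struct fair_rwsem *s)
{
    if (s->writer_active || s->queued_writers > 0)
        return false;
    s->active_readers++;
    return true;
}

/* Returns true if the write lock is granted at once; otherwise the
 * writer is counted as queued and false is returned. */
bool try_down_write_fair(struct fair_rwsem *s)
{
    if (s->writer_active || s->active_readers > 0) {
        s->queued_writers++;
        return false;
    }
    s->writer_active = true;
    return true;
}
```

With thread B's read granted first and thread C's write attempt queued behind it, thread A's later read attempt is refused even though only readers hold the lock, which is the situation described above.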

I guess _cpu_down should
down_read(&cpu_chain.rwsem);
and then call notifier_call_chain multiple times. I wonder if that
would be safe.

Who do we blame this on? Are you still the cpu-hot-plug guy Rusty?

NeilBrown
Rusty Russell
2006-10-11 13:32:03 UTC
Permalink
Post by Neil Brown
Who do we blame this on? Are you still the cpu-hot-plug guy Rusty?
Well, I wasn't the one who introduced locking into notifier chains,
which is the cause from my reading of your explanation...

Rusty.
--
Help! Save Australia from the worst of the DMCA: http://linux.org.au/law
Andrew Morton
2006-10-11 16:39:20 UTC
Permalink
On Wed, 11 Oct 2006 23:08:21 +1000
Post by Neil Brown
Post by Arjan van de Ven
Post by Neil Brown
blocking_notifier_call_chain is
down_read(&nh->rwsem);
ret = notifier_call_chain(&nh->head, val, v);
up_read(&nh->rwsem);
and so holds ->rwsem while calling the callback.
down_read(&cpu_chain.rwsem);
mutex_lock(&workqueue_mutex);
up_read(&cpu_chain.rwsem);
down_read(&cpu_chain.rwsem);
mutex_unlock(&workqueue_mutex);
up_read(&cpu_chain.rwsem);
and lockdep doesn't seem to like this. It sees workqueue_mutex
claimed while cpu_chain.rwsem is held, and then it sees
cpu_chain.rwsem claimed while workqueue_mutex is held, which looks a
bit like a classic ABBA deadlock.
Of course because it is a 'down_read' rather than a 'down', it isn't
really a deadlock.
ok can you explain to me why "down_read" doesn't make this a deadlock
while "down" would make it a deadlock? I have trouble following your
reasoning.....
(remember that rwsems are strictly fair)
I see your point.
While thread A holds just workqueue_mutex,
thread B takes cpu_chain.rwsem for read then tries to take
workqueue_mutex and blocks.
Now thread C tries to get a write lock on cpu_chain.rwsem and blocks
as well.
Finally thread A moves on to try to get a read lock on cpu_chain.rwsem
and this blocks because thread C is waiting for a write lock.
So A waits on B and C, C waits on B, B waits on A.
Deadlock.
Except the entire operation is serialised by the two top-level callers
(cpu_up() and cpu_down()) taking mutex_lock(&cpu_add_remove_lock). Can
lockdep be taught about that?
Post by Neil Brown
Who do we blame this on? Are you still the cpu-hot-plug guy Rusty?
It's fun blaming Rusty for stuff, but he can dodge this one with
more-than-usual ease, I'm afraid.
Neil Brown
2006-10-11 23:46:50 UTC
Permalink
Post by Andrew Morton
Post by Neil Brown
So A waits on B and C, C waits on B, B waits on A.
Deadlock.
Except the entire operation is serialised by the two top-level callers
(cpu_up() and cpu_down()) taking mutex_lock(&cpu_add_remove_lock). Can
lockdep be taught about that?
So you are saying that even though we have locking sequences
A -> B and B -> A,
that cannot - in this case - cause a deadlock as both sequences only
ever happen under a third exclusive lock C?
So when lockdep records a lock-dependency A -> B, it should also
record a list of locks that are *always* held when that dependency
occurs.
Then, when it finds a new dependency and does loop detection, it
should exclude from the path any dependency which is always under a
lock that some other dependency in the path is always under.
Also, loop checking has to happen both when a new dependency is found,
and when a lock is removed from the set of locks that protect the
dependency.
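
A sketch of what that bookkeeping might look like for the two-edge case, under the assumption that every guard lock is exclusive and that locks are identified by bits in a mask. The struct and function names are hypothetical; nothing like this exists in lockdep as far as this thread shows.

```c
#include <stdbool.h>

/* One recorded dependency edge, e.g. "B acquired while A held". */
struct dep_edge {
    bool seen;             /* has this edge been observed at all? */
    unsigned long guards;  /* bitmask of locks held on EVERY observation */
};

/* Note one more observation of the edge. held_mask has one bit set per
 * lock held at the moment of the acquisition. */
void observe_edge(struct dep_edge *e, unsigned long held_mask)
{
    if (!e->seen) {
        e->guards = held_mask;
        e->seen = true;
    } else {
        e->guards &= held_mask;  /* keep only locks held every time */
    }
}

/* Two edges forming a cycle can only deadlock if no common exclusive
 * lock serialises them. */
bool cycle_can_deadlock(const struct dep_edge *ab, const struct dep_edge *ba)
{
    return ab->seen && ba->seen && (ab->guards & ba->guards) == 0;
}
```

Because a later observation can shrink an edge's guard set, a cycle that was previously harmless has to be re-checked at that point, which is the second condition mentioned above.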

Recording stack traces might be interesting as you potentially need to
record a trace for every minimal set of locks that the dependency is
created under.

So the ball is back in Ingo's court ?

Though it is odd that the warning doesn't trigger every time....
Post by Andrew Morton
Post by Neil Brown
Who do we blame this on? Are you still the cpu-hot-plug guy Rusty?
It's fun blaming Rusty for stuff, but he can dodge this one with
more-than-usual ease, I'm afraid.
In that case, I was never dreaming of blaming him, only letting him
know that there is a lock-dep warning in code that he might be seen as
responsible for - just in case anyone does blame him.
Yes. That's what I was doing. Definitely.

NeilBrown
Arjan van de Ven
2006-10-12 06:51:12 UTC
Permalink
Post by Neil Brown
Post by Andrew Morton
Post by Neil Brown
So A waits on B and C, C waits on B, B waits on A.
Deadlock.
Except the entire operation is serialised by the two top-level callers
(cpu_up() and cpu_down()) taking mutex_lock(&cpu_add_remove_lock). Can
lockdep be taught about that?
So you are saying that even though we have locking sequences
A -> B and B -> A,
that cannot - in this case - cause a deadlock as both sequences only
ever happen under a third exclusive lock C,
So when lockdep records a lock-dependency A -> B, it should also
record a list of locks that are *always* held when that dependency
occurs.
in that case... why are A and B there *at all* ?
Neil Brown
2006-10-12 07:53:11 UTC
Permalink
Post by Arjan van de Ven
Post by Neil Brown
Post by Andrew Morton
Post by Neil Brown
So A waits on B and C, C waits on B, B waits on A.
Deadlock.
Except the entire operation is serialised by the two top-level callers
(cpu_up() and cpu_down()) taking mutex_lock(&cpu_add_remove_lock). Can
lockdep be taught about that?
So you are saying that even though we have locking sequences
A -> B and B -> A,
that cannot - in this case - cause a deadlock as both sequences only
ever happen under a third exclusive lock C,
So when lockdep records a lock-dependency A -> B, it should also
record a list of locks that are *always* held when that dependency
occurs.
in that case... why are A and B there *at all* ?
:-)

Obviously because someone out-side of C might want to interact with
the data protected by A or B.

But wait... what are the implications of that?

The data managed by B (where B == cpu_chain.rwsem) is a list of
notifiers. Each notifier is called with CPU_DOWN_PREPARE and then
will be called with either CPU_DOWN_FAILED or CPU_DEAD.

Now because we release and reclaim B it is possible for someone to add
or remove a notifier. Either of these events means that the relevant
notifier will get called with one of these but not the other. It
seems likely that a notifier will be written to assume this
bracketing.
In fact workqueue_cpu_callback (which is such a notifier) does
mutex_lock(&workqueue_mutex);
in CPU_DOWN_PREPARE and
mutex_unlock(&workqueue_mutex);
in CPU_DOWN_FAILED and CPU_DEAD.

If it got registered in the middle of _cpu_down, then it would unlock
a mutex that wasn't locked. Now I suspect it cannot be registered
while _cpu_down is active as it is only registered once, very early.
But it certainly does raise the question of why all this locking
is needed....
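
The unbalanced case can be shown with a toy model in which a notifier counts its lock and unlock calls. Everything here (the TOY_ event names, struct toy_notifier, toy_call_chain) is invented for the illustration and only mimics the shape of the real callback.

```c
/* Stand-ins for the three hotplug events discussed above. */
enum toy_event { TOY_DOWN_PREPARE, TOY_DOWN_FAILED, TOY_DEAD };

struct toy_notifier {
    int registered;    /* non-zero once the notifier is on the chain */
    int lock_balance;  /* +1 per lock, -1 per unlock; should never go negative */
};

/* Deliver one event the way workqueue_cpu_callback reacts to it:
 * lock on DOWN_PREPARE, unlock on DOWN_FAILED or DEAD. */
void toy_call_chain(struct toy_notifier *n, enum toy_event ev)
{
    if (!n->registered)
        return;                 /* not on the chain: event is missed */
    if (ev == TOY_DOWN_PREPARE)
        n->lock_balance++;
    else
        n->lock_balance--;
}
```

If the notifier is registered after TOY_DOWN_PREPARE has been delivered but before TOY_DEAD, lock_balance ends at -1, the model's equivalent of unlocking a mutex that was never locked.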

I think I'm in favour of the following. It should clean up the
lockdep warning and seems to make sense.

NeilBrown

Signed-off-by: Neil Brown <***@suse.de>

### Diffstat output
./kernel/cpu.c | 20 ++++++++++++--------
1 file changed, 12 insertions(+), 8 deletions(-)

diff .prev/kernel/cpu.c ./kernel/cpu.c
--- .prev/kernel/cpu.c 2006-10-12 17:46:37.000000000 +1000
+++ ./kernel/cpu.c 2006-10-12 17:51:50.000000000 +1000
@@ -126,9 +126,11 @@ static int _cpu_down(unsigned int cpu)
if (!cpu_online(cpu))
return -EINVAL;

- err = blocking_notifier_call_chain(&cpu_chain, CPU_DOWN_PREPARE,
+ down_read(&cpu_chain.rwsem);
+ err = raw_notifier_call_chain(&cpu_chain, CPU_DOWN_PREPARE,
(void *)(long)cpu);
if (err == NOTIFY_BAD) {
+ up_read(&cpu_chain.rwsem);
printk("%s: attempt to take down CPU %u failed\n",
__FUNCTION__, cpu);
return -EINVAL;
@@ -146,11 +148,11 @@ static int _cpu_down(unsigned int cpu)

if (IS_ERR(p)) {
/* CPU didn't die: tell everyone. Can't complain. */
- if (blocking_notifier_call_chain(&cpu_chain, CPU_DOWN_FAILED,
+ if (raw_notifier_call_chain(&cpu_chain, CPU_DOWN_FAILED,
(void *)(long)cpu) == NOTIFY_BAD)
BUG();

- err = PTR_ERR(p);
+ err = PTR_ERR(p);
goto out_allowed;
}

@@ -169,7 +171,7 @@ static int _cpu_down(unsigned int cpu)
put_cpu();

/* CPU is completely dead: tell everyone. Too late to complain. */
- if (blocking_notifier_call_chain(&cpu_chain, CPU_DEAD,
+ if (raw_notifier_call_chain(&cpu_chain, CPU_DEAD,
(void *)(long)cpu) == NOTIFY_BAD)
BUG();

@@ -178,6 +180,7 @@ static int _cpu_down(unsigned int cpu)
out_thread:
err = kthread_stop(p);
out_allowed:
+ up_read(&cpu_chain.rwsem);
set_cpus_allowed(current, old_allowed);
return err;
}
@@ -206,7 +209,8 @@ static int __devinit _cpu_up(unsigned in
if (cpu_online(cpu) || !cpu_present(cpu))
return -EINVAL;

- ret = blocking_notifier_call_chain(&cpu_chain, CPU_UP_PREPARE, hcpu);
+ down_read(&cpu_chain.rwsem);
+ ret = raw_notifier_call_chain(&cpu_chain, CPU_UP_PREPARE, hcpu);
if (ret == NOTIFY_BAD) {
printk("%s: attempt to bring up CPU %u failed\n",
__FUNCTION__, cpu);
@@ -223,13 +227,13 @@ static int __devinit _cpu_up(unsigned in
BUG_ON(!cpu_online(cpu));

/* Now call notifier in preparation. */
- blocking_notifier_call_chain(&cpu_chain, CPU_ONLINE, hcpu);
+ raw_notifier_call_chain(&cpu_chain, CPU_ONLINE, hcpu);

out_notify:
if (ret != 0)
- blocking_notifier_call_chain(&cpu_chain,
+ raw_notifier_call_chain(&cpu_chain,
CPU_UP_CANCELED, hcpu);
-
+ up_read(&cpu_chain.rwsem);
return ret;
}
Andrew Morton
2006-10-12 08:04:06 UTC
Permalink
On Thu, 12 Oct 2006 17:53:11 +1000
Post by Neil Brown
I think I'm in favour of the following.
Would be simpler to take cpu_add_remove_lock in
[un]register_cpu_notifier(). I actually thought I'd done that to fix this
bug but must have forgotten or lost the patch :(

We can then convert all the notifier chains in there to raw_*.
Neil Brown
2006-10-13 04:49:39 UTC
Permalink
Post by Andrew Morton
On Thu, 12 Oct 2006 17:53:11 +1000
Post by Neil Brown
I think I'm in favour of the following.
Would be simpler to take cpu_add_remove_lock in
[un]register_cpu_notifier(). I actually thought I'd done that to fix this
bug but must have forgotten or lost the patch :(
We can then convert all the notifier chains in there to raw_*.
The two philosophers gaped at him.

"Bloody hell," said Majikthise, "now that is what I call
thinking. Here Vroomfondel, why do we never think of things like
that?"

"Dunno," said Vroomfondel in an awed whisper, "think our brains must
be too highly trained Majikthise."

[ http://flag.blackened.net/dinsdale/dna/book1.html ]

I guess you'll be wanting this then, unless you have done it already.

NeilBrown

-----------
Subject: Convert cpu hotplug notifiers to use raw_notifier instead of blocking_notifier

The use of blocking notifier by _cpu_up and _cpu_down in cpu.c
has two problems.

1/ An interaction with the workqueue notifier causes lockdep to
spit a warning.
2/ A notifier could conceivably be added or removed while _cpu_up or
_cpu_down is in progress. As each notifier is called twice
(prepare then commit/abort) this could be unhealthy.

To fix 2 we simply take cpu_add_remove_lock while adding
or removing notifiers to/from the list.

This makes the 'blocking' usage unnecessary as all accesses to
cpu_chain are now protected by cpu_add_remove_lock. So
change "blocking" to "raw" in all relevant places.
This fixes 1.

Credit: Andrew Morton
Cc: ***@rustcorp.com.au (maintainer)
Cc: Michal Piotrowski <***@gmail.com> (reporter)
Signed-off-by: Neil Brown <***@suse.de>

### Diffstat output
./kernel/cpu.c | 24 +++++++++++++++---------
1 file changed, 15 insertions(+), 9 deletions(-)

diff .prev/kernel/cpu.c ./kernel/cpu.c
--- .prev/kernel/cpu.c 2006-10-13 14:30:56.000000000 +1000
+++ ./kernel/cpu.c 2006-10-13 14:33:49.000000000 +1000
@@ -19,7 +19,7 @@
static DEFINE_MUTEX(cpu_add_remove_lock);
static DEFINE_MUTEX(cpu_bitmask_lock);

-static __cpuinitdata BLOCKING_NOTIFIER_HEAD(cpu_chain);
+static __cpuinitdata RAW_NOTIFIER_HEAD(cpu_chain);

/* If set, cpu_up and cpu_down will return -EBUSY and do nothing.
* Should always be manipulated under cpu_add_remove_lock
@@ -68,7 +68,11 @@ EXPORT_SYMBOL_GPL(unlock_cpu_hotplug);
/* Need to know about CPUs going up/down? */
int __cpuinit register_cpu_notifier(struct notifier_block *nb)
{
- return blocking_notifier_chain_register(&cpu_chain, nb);
+ int ret;
+ mutex_lock(&cpu_add_remove_lock);
+ ret = raw_notifier_chain_register(&cpu_chain, nb);
+ mutex_unlock(&cpu_add_remove_lock);
+ return ret;
}

#ifdef CONFIG_HOTPLUG_CPU
@@ -77,7 +81,9 @@ EXPORT_SYMBOL(register_cpu_notifier);

void unregister_cpu_notifier(struct notifier_block *nb)
{
- blocking_notifier_chain_unregister(&cpu_chain, nb);
+ mutex_lock(&cpu_add_remove_lock);
+ raw_notifier_chain_unregister(&cpu_chain, nb);
+ mutex_unlock(&cpu_add_remove_lock);
}
EXPORT_SYMBOL(unregister_cpu_notifier);

@@ -126,7 +132,7 @@ static int _cpu_down(unsigned int cpu)
if (!cpu_online(cpu))
return -EINVAL;

- err = blocking_notifier_call_chain(&cpu_chain, CPU_DOWN_PREPARE,
+ err = raw_notifier_call_chain(&cpu_chain, CPU_DOWN_PREPARE,
(void *)(long)cpu);
if (err == NOTIFY_BAD) {
printk("%s: attempt to take down CPU %u failed\n",
@@ -146,7 +152,7 @@ static int _cpu_down(unsigned int cpu)

if (IS_ERR(p)) {
/* CPU didn't die: tell everyone. Can't complain. */
- if (blocking_notifier_call_chain(&cpu_chain, CPU_DOWN_FAILED,
+ if (raw_notifier_call_chain(&cpu_chain, CPU_DOWN_FAILED,
(void *)(long)cpu) == NOTIFY_BAD)
BUG();

@@ -169,7 +175,7 @@ static int _cpu_down(unsigned int cpu)
put_cpu();

/* CPU is completely dead: tell everyone. Too late to complain. */
- if (blocking_notifier_call_chain(&cpu_chain, CPU_DEAD,
+ if (raw_notifier_call_chain(&cpu_chain, CPU_DEAD,
(void *)(long)cpu) == NOTIFY_BAD)
BUG();

@@ -206,7 +212,7 @@ static int __devinit _cpu_up(unsigned in
if (cpu_online(cpu) || !cpu_present(cpu))
return -EINVAL;

- ret = blocking_notifier_call_chain(&cpu_chain, CPU_UP_PREPARE, hcpu);
+ ret = raw_notifier_call_chain(&cpu_chain, CPU_UP_PREPARE, hcpu);
if (ret == NOTIFY_BAD) {
printk("%s: attempt to bring up CPU %u failed\n",
__FUNCTION__, cpu);
@@ -223,11 +229,11 @@ static int __devinit _cpu_up(unsigned in
BUG_ON(!cpu_online(cpu));

/* Now call notifier in preparation. */
- blocking_notifier_call_chain(&cpu_chain, CPU_ONLINE, hcpu);
+ raw_notifier_call_chain(&cpu_chain, CPU_ONLINE, hcpu);

out_notify:
if (ret != 0)
- blocking_notifier_call_chain(&cpu_chain,
+ raw_notifier_call_chain(&cpu_chain,
CPU_UP_CANCELED, hcpu);

return ret;
Dave Kleikamp
2006-10-10 15:47:49 UTC
Permalink
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19-rc1/2.6.19-rc1-mm1/
I'm seeing an exception in filp_close(), called from sys_dup2(). I have
only seen it when I try to start up a java application (Lotus
Workplace).

I suspect that it may be related to the fdtable work, but I haven't
investigated it too closely.
Post by Andrew Morton
+fdtable-delete-pointless-code-in-dup_fd.patch
+fdtable-make-fdarray-and-fdsets-equal-in-size.patch
+fdtable-remove-the-free_files-field.patch
+fdtable-implement-new-pagesize-based-fdtable-allocator.patch
Redo the fdtable code.
BUG: unable to handle kernel paging request at virtual address 3237304a
printing eip:
c015636f
*pde = 00000000
Oops: 0000 [#1]
PREEMPT
last sysfs file: /block/hda/hda5/stat
Modules linked in: irda crc_ccitt tun airo e1000 pcmcia yenta_socket rsrc_nonstatic pcmcia_core radeon rtc ntfs jfs
CPU: 0
EIP: 0060:[<c015636f>] Not tainted VLI
EFLAGS: 00010206 (2.6.19-rc1-mm1 #1)
EIP is at filp_close+0xc/0x5e
eax: 00000000 ebx: 32373036 ecx: cbc66000 edx: 00000000
esi: 00000001 edi: dff0cc80 ebp: cbc67f8c esp: cbc67f80
ds: 007b es: 007b ss: 0068
Process java (pid: 12945, ti=cbc66000 task=ce9d6b00 task.ti=cbc66000)
Stack: 0000020c 00000001 cef765c0 cbc67fb4 c01623ed 32373036 dff0cc80 dff0cca4
32373036 dff0cc80 00000007 00000000 fffffff4 cbc66000 c0102e05 00000007
0000020c 00000000 00000000 fffffff4 bf966c14 0000003f 0000007b c03b007b
Call Trace:
[<c01623ed>] sys_dup2+0xdb/0x10f
[<c0102e05>] sysenter_past_esp+0x56/0x79
DWARF2 unwinder stuck at sysenter_past_esp+0x56/0x79
Leftover inexact backtrace:
=======================
Code: 26 00 8b 43 04 8b 40 0c 0f b3 30 3b 73 4c 73 03 89 73 4c 89 f8 e8 c0 51 26 00 5b 5e 5f 5d c3 55 89 e5 57 8b 7d 0c 56 53 8b 5d 08 <8b> 43 14 85 c0 75 0f 68 a1 e8 3f c0 31 f6 e8 54 35 fc ff 59 eb
EIP: [<c015636f>] filp_close+0xc/0x5e SS:ESP 0068:cbc67f80
<1>BUG: unable to handle kernel paging request at virtual address 02404e7c
printing eip:
c015636f
*pde = 00000000
Oops: 0000 [#2]
PREEMPT
last sysfs file: /block/hda/hda5/stat
Modules linked in: irda crc_ccitt tun airo e1000 pcmcia yenta_socket rsrc_nonstatic pcmcia_core radeon rtc ntfs jfs
CPU: 0
EIP: 0060:[<c015636f>] Not tainted VLI
EFLAGS: 00010202 (2.6.19-rc1-mm1 #1)
EIP is at filp_close+0xc/0x5e
eax: 00000000 ebx: 02404e68 ecx: cc636000 edx: 00000000
esi: 00000001 edi: dfcfecc0 ebp: cc637f8c esp: cc637f80
ds: 007b es: 007b ss: 0068
Process java (pid: 13593, ti=cc636000 task=ce97f550 task.ti=cc636000)
Stack: 0000020c 00000001 cbc56b80 cc637fb4 c01623ed 02404e68 dfcfecc0 dfcfece4
02404e68 dfcfecc0 00000007 00000000 fffffff4 cc636000 c0102e05 00000007
0000020c 00000000 00000000 fffffff4 bfc5f704 0000003f 0000007b c03b007b
Call Trace:
[<c01623ed>] sys_dup2+0xdb/0x10f
[<c0102e05>] sysenter_past_esp+0x56/0x79
DWARF2 unwinder stuck at sysenter_past_esp+0x56/0x79
Leftover inexact backtrace:
=======================
Code: 26 00 8b 43 04 8b 40 0c 0f b3 30 3b 73 4c 73 03 89 73 4c 89 f8 e8 c0 51 26 00 5b 5e 5f 5d c3 55 89 e5 57 8b 7d 0c 56 53 8b 5d 08 <8b> 43 14 85 c0 75 0f 68 a1 e8 3f c0 31 f6 e8 54 35 fc ff 59 eb
EIP: [<c015636f>] filp_close+0xc/0x5e SS:ESP 0068:cc637f80
<1>BUG: unable to handle kernel NULL pointer dereference at virtual address 00000127
printing eip:
c01874ca
*pde = 00000000
Oops: 0000 [#3]
PREEMPT
last sysfs file: /block/hda/hda5/stat
Modules linked in: irda crc_ccitt tun airo e1000 pcmcia yenta_socket rsrc_nonstatic pcmcia_core radeon rtc ntfs jfs
CPU: 0
EIP: 0060:[<c01874ca>] Not tainted VLI
EFLAGS: 00210246 (2.6.19-rc1-mm1 #1)
EIP is at dnotify_flush+0xf/0x73
eax: c8892c4b ebx: b7f96e4a ecx: cab2c000 edx: b7f96e4a
esi: 000000ff edi: c148a280 ebp: cab2df70 esp: cab2df64
ds: 007b es: 007b ss: 0068
Process java (pid: 13942, ti=cab2c000 task=d572d4d0 task.ti=cab2c000)
Stack: b7f96e4a 00000000 c148a280 cab2df8c c01563a6 b7f96e4a c148a280 0000020c
00000001 cef5dec0 cab2dfb4 c01623ed b7f96e4a c148a280 c148a2a4 b7f96e4a
c148a280 00000007 00000000 fffffff4 cab2c000 c0102e05 00000007 0000020c
Call Trace:
[<c01563a6>] filp_close+0x43/0x5e
[<c01623ed>] sys_dup2+0xdb/0x10f
[<c0102e05>] sysenter_past_esp+0x56/0x79
DWARF2 unwinder stuck at sysenter_past_esp+0x56/0x79
Leftover inexact backtrace:
=======================
Code: ff ff 53 e8 d3 00 fe ff 83 c4 0c eb 07 89 f0 e8 6b 40 23 00 8d 65 f4 5b 5e 5f 5d c3 55 89 e5 8b 55 08 57 56 53 8b 42 08 8b 70 30 <0f> b7 46 28 25 00 f0 00 00 3d 00 40 00 00 75 4c 8d 7e 70 89 f8
EIP: [<c01874ca>] dnotify_flush+0xf/0x73 SS:ESP 0068:cab2df64
--
David Kleikamp
IBM Linux Technology Center
Dave Kleikamp
2006-10-10 22:07:04 UTC
Permalink
Post by Dave Kleikamp
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19-rc1/2.6.19-rc1-mm1/
I'm seeing an exception in filp_close(), called from sys_dup2(). I have
only seen it when I try to start up a java application (Lotus
Workplace).
I suspect that it may be related to the fdtable work, but I haven't
investigated it too closely.
Still don't know exactly what's going on here. In case it helps, this
is the call to dup2() from strace output:

1419 open("/dev/null", O_RDWR) = 7
1419 getrlimit(RLIMIT_NOFILE, {rlim_cur=1024, rlim_max=1024}) = 0
1419 dup2(7, 524) = 524
1419 dup2(7, 525 <unfinished ...>
Post by Dave Kleikamp
Post by Andrew Morton
+fdtable-delete-pointless-code-in-dup_fd.patch
+fdtable-make-fdarray-and-fdsets-equal-in-size.patch
+fdtable-remove-the-free_files-field.patch
+fdtable-implement-new-pagesize-based-fdtable-allocator.patch
Redo the fdtable code.
BUG: unable to handle kernel paging request at virtual address 3237304a
c015636f
*pde = 00000000
Oops: 0000 [#1]
PREEMPT
last sysfs file: /block/hda/hda5/stat
Modules linked in: irda crc_ccitt tun airo e1000 pcmcia yenta_socket rsrc_nonstatic pcmcia_core radeon rtc ntfs jfs
CPU: 0
EIP: 0060:[<c015636f>] Not tainted VLI
EFLAGS: 00010206 (2.6.19-rc1-mm1 #1)
EIP is at filp_close+0xc/0x5e
eax: 00000000 ebx: 32373036 ecx: cbc66000 edx: 00000000
esi: 00000001 edi: dff0cc80 ebp: cbc67f8c esp: cbc67f80
ds: 007b es: 007b ss: 0068
Process java (pid: 12945, ti=cbc66000 task=ce9d6b00 task.ti=cbc66000)
Stack: 0000020c 00000001 cef765c0 cbc67fb4 c01623ed 32373036 dff0cc80 dff0cca4
32373036 dff0cc80 00000007 00000000 fffffff4 cbc66000 c0102e05 00000007
0000020c 00000000 00000000 fffffff4 bf966c14 0000003f 0000007b c03b007b
[<c01623ed>] sys_dup2+0xdb/0x10f
[<c0102e05>] sysenter_past_esp+0x56/0x79
DWARF2 unwinder stuck at sysenter_past_esp+0x56/0x79
=======================
Code: 26 00 8b 43 04 8b 40 0c 0f b3 30 3b 73 4c 73 03 89 73 4c 89 f8 e8 c0 51 26 00 5b 5e 5f 5d c3 55 89 e5 57 8b 7d 0c 56 53 8b 5d 08 <8b> 43 14 85 c0 75 0f 68 a1 e8 3f c0 31 f6 e8 54 35 fc ff 59 eb
EIP: [<c015636f>] filp_close+0xc/0x5e SS:ESP 0068:cbc67f80
<1>BUG: unable to handle kernel paging request at virtual address 02404e7c
c015636f
*pde = 00000000
Oops: 0000 [#2]
PREEMPT
last sysfs file: /block/hda/hda5/stat
Modules linked in: irda crc_ccitt tun airo e1000 pcmcia yenta_socket rsrc_nonstatic pcmcia_core radeon rtc ntfs jfs
CPU: 0
EIP: 0060:[<c015636f>] Not tainted VLI
EFLAGS: 00010202 (2.6.19-rc1-mm1 #1)
EIP is at filp_close+0xc/0x5e
eax: 00000000 ebx: 02404e68 ecx: cc636000 edx: 00000000
esi: 00000001 edi: dfcfecc0 ebp: cc637f8c esp: cc637f80
ds: 007b es: 007b ss: 0068
Process java (pid: 13593, ti=cc636000 task=ce97f550 task.ti=cc636000)
Stack: 0000020c 00000001 cbc56b80 cc637fb4 c01623ed 02404e68 dfcfecc0 dfcfece4
02404e68 dfcfecc0 00000007 00000000 fffffff4 cc636000 c0102e05 00000007
0000020c 00000000 00000000 fffffff4 bfc5f704 0000003f 0000007b c03b007b
[<c01623ed>] sys_dup2+0xdb/0x10f
[<c0102e05>] sysenter_past_esp+0x56/0x79
DWARF2 unwinder stuck at sysenter_past_esp+0x56/0x79
=======================
Code: 26 00 8b 43 04 8b 40 0c 0f b3 30 3b 73 4c 73 03 89 73 4c 89 f8 e8 c0 51 26 00 5b 5e 5f 5d c3 55 89 e5 57 8b 7d 0c 56 53 8b 5d 08 <8b> 43 14 85 c0 75 0f 68 a1 e8 3f c0 31 f6 e8 54 35 fc ff 59 eb
EIP: [<c015636f>] filp_close+0xc/0x5e SS:ESP 0068:cc637f80
<1>BUG: unable to handle kernel NULL pointer dereference at virtual address 00000127
c01874ca
*pde = 00000000
Oops: 0000 [#3]
PREEMPT
last sysfs file: /block/hda/hda5/stat
Modules linked in: irda crc_ccitt tun airo e1000 pcmcia yenta_socket rsrc_nonstatic pcmcia_core radeon rtc ntfs jfs
CPU: 0
EIP: 0060:[<c01874ca>] Not tainted VLI
EFLAGS: 00210246 (2.6.19-rc1-mm1 #1)
EIP is at dnotify_flush+0xf/0x73
eax: c8892c4b ebx: b7f96e4a ecx: cab2c000 edx: b7f96e4a
esi: 000000ff edi: c148a280 ebp: cab2df70 esp: cab2df64
ds: 007b es: 007b ss: 0068
Process java (pid: 13942, ti=cab2c000 task=d572d4d0 task.ti=cab2c000)
Stack: b7f96e4a 00000000 c148a280 cab2df8c c01563a6 b7f96e4a c148a280 0000020c
00000001 cef5dec0 cab2dfb4 c01623ed b7f96e4a c148a280 c148a2a4 b7f96e4a
c148a280 00000007 00000000 fffffff4 cab2c000 c0102e05 00000007 0000020c
[<c01563a6>] filp_close+0x43/0x5e
[<c01623ed>] sys_dup2+0xdb/0x10f
[<c0102e05>] sysenter_past_esp+0x56/0x79
DWARF2 unwinder stuck at sysenter_past_esp+0x56/0x79
=======================
Code: ff ff 53 e8 d3 00 fe ff 83 c4 0c eb 07 89 f0 e8 6b 40 23 00 8d 65 f4 5b 5e 5f 5d c3 55 89 e5 8b 55 08 57 56 53 8b 42 08 8b 70 30 <0f> b7 46 28 25 00 f0 00 00 3d 00 40 00 00 75 4c 8d 7e 70 89 f8
EIP: [<c01874ca>] dnotify_flush+0xf/0x73 SS:ESP 0068:cab2df64
--
David Kleikamp
IBM Linux Technology Center
Vadim Lobanov
2006-10-10 22:14:35 UTC
Permalink
Post by Dave Kleikamp
Still don't know exactly what's going on here. In case it helps, this
1419 open("/dev/null", O_RDWR) = 7
1419 getrlimit(RLIMIT_NOFILE, {rlim_cur=1024, rlim_max=1024}) = 0
1419 dup2(7, 524) = 524
1419 dup2(7, 525 <unfinished ...>
Thanks for the data point.

Hmmm... looks as if the likely sequence of events was:
create embedded fdtable
extend fdtable, allocate external data to handle fd = 524
try to extend fdtable again, crash.

Seems as if alloc_fdtable() or copy_fdtable() are to blame, but the code logic
seems to be identical. Hmmmm.

-- Vadim Lobanov
Vadim Lobanov
2006-10-10 22:38:36 UTC
Permalink
Post by Dave Kleikamp
Post by Dave Kleikamp
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19-rc1/2.6.19-rc1-mm1/
I'm seeing an exception in filp_close(), called from sys_dup2(). I have
only seen it when I try to start up a java application (Lotus
Workplace).
I suspect that it may be related to the fdtable work, but I haven't
investigated it too closely.
Still don't know exactly what's going on here. In case it helps, this
1419 open("/dev/null", O_RDWR) = 7
1419 getrlimit(RLIMIT_NOFILE, {rlim_cur=1024, rlim_max=1024}) = 0
1419 dup2(7, 524) = 524
1419 dup2(7, 525 <unfinished ...>
Post by Dave Kleikamp
Post by Andrew Morton
+fdtable-delete-pointless-code-in-dup_fd.patch
+fdtable-make-fdarray-and-fdsets-equal-in-size.patch
+fdtable-remove-the-free_files-field.patch
+fdtable-implement-new-pagesize-based-fdtable-allocator.patch
Redo the fdtable code.
D'oh!!! Everybody who hit this bug can feel free to call me a moron now! (And
Andrew will probably take me up on that offer, for all the residual flak he
caught. :)) The problem is in the following logic:
+ nr++;
+ nr /= (PAGE_SIZE / 4 / sizeof(struct file *));
+ nr = roundup_pow_of_two(nr);
+ nr *= (PAGE_SIZE / 4 / sizeof(struct file *));
+ if (nr > NR_OPEN)
+ nr = NR_OPEN;
The problem is that roundup_pow_of_two() will not necessarily bring the array
up to the necessary size, and we get an array overflow. This is clearly
visible in the example above: dup2(..., 524) with a PAGE_SIZE of 4K. (Thanks
for sending that in, Dave.) Let me think about the best way to fix this
computation, and I'll send out a patch for you folks to test to see if it
fixes your problem, if you'll oblige.

-- Vadim Lobanov, idiot of the day
Michal Piotrowski
2006-10-10 16:09:31 UTC
Permalink
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19-rc1/2.6.19-rc1-mm1/
This looks strange
ps aux | grep t3
root 4305 81.6 0.1 5952 2596 pts/7 R+ 17:54 2:44
python ./rt-tester.py t3-l1-pi-steal.tst
michal 4351 0.0 0.0 3908 760 pts/5 R+ 17:58 0:00 grep t3
[***@euridica ~]$ ps aux | grep creat
root 3934 87.3 0.0 1652 496 pts/4 R 17:25 28:37 creat05
michal 4353 0.0 0.0 3912 772 pts/5 S+ 17:58 0:00 grep creat

python ./rt-tester.py t3-l1-pi-steal.tst and creat05 (from LTP) are
always in running state (creat05 since 28 minutes). I don't have any
idea why this happens.

Regards,
Michal
--
Michal K. K. Piotrowski
LTG - Linux Testers Group
(http://www.stardust.webpages.pl/ltg/)
Andrew Morton
2006-10-10 19:04:41 UTC
Permalink
On Tue, 10 Oct 2006 18:09:31 +0200
Post by Michal Piotrowski
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19-rc1/2.6.19-rc1-mm1/
This looks strange
ps aux | grep t3
root 4305 81.6 0.1 5952 2596 pts/7 R+ 17:54 2:44
python ./rt-tester.py t3-l1-pi-steal.tst
michal 4351 0.0 0.0 3908 760 pts/5 R+ 17:58 0:00 grep t3
root 3934 87.3 0.0 1652 496 pts/4 R 17:25 28:37 creat05
michal 4353 0.0 0.0 3912 772 pts/5 S+ 17:58 0:00 grep creat
python ./rt-tester.py t3-l1-pi-steal.tst and creat05 (from LTP) are
always in running state (creat05 since 28 minutes). I don't have any
idea why this happens.
The fdtable patches might have some problems.

http://userweb.kernel.org/~akpm/mp.bz2 is 2.6.19-rc1-mm1 without those
patches. Does it work better?

Thanks.
Michal Piotrowski
2006-10-10 21:44:04 UTC
Permalink
Post by Andrew Morton
On Tue, 10 Oct 2006 18:09:31 +0200
Post by Michal Piotrowski
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19-rc1/2.6.19-rc1-mm1/
This looks strange
ps aux | grep t3
root 4305 81.6 0.1 5952 2596 pts/7 R+ 17:54 2:44
python ./rt-tester.py t3-l1-pi-steal.tst
michal 4351 0.0 0.0 3908 760 pts/5 R+ 17:58 0:00 grep t3
root 3934 87.3 0.0 1652 496 pts/4 R 17:25 28:37 creat05
michal 4353 0.0 0.0 3912 772 pts/5 S+ 17:58 0:00 grep creat
python ./rt-tester.py t3-l1-pi-steal.tst and creat05 (from LTP) are
always in running state (creat05 since 28 minutes). I don't have any
idea why this happens.
The fdtable patches might have some problems.
http://userweb.kernel.org/~akpm/mp.bz2 is 2.6.19-rc1-mm1 without those
patches. Does it work better?
Yes, it does. Thanks.

BTW. Kernel hangs while running Cyclictest
(http://rt.wiki.kernel.org/index.php/Cyclictest)
cyclictest -t 10 -l 100000
(or "bin/autotest tests/cyclictest/control" in autotest). I don't see
anything special on tty (currently my sysklogd is broken, FC6
problem..)

Regards,
Michal
--
Michal K. K. Piotrowski
LTG - Linux Testers Group
(http://www.stardust.webpages.pl/ltg/)
Andrew Morton
2006-10-10 21:52:35 UTC
Permalink
On Tue, 10 Oct 2006 23:44:04 +0200
Post by Michal Piotrowski
BTW. Kernel hangs while running Cyclictest
(http://rt.wiki.kernel.org/index.php/Cyclictest)
cyclictest -t 10 -l 100000
(or "bin/autotest tests/cyclictest/control" in autotest). I don't see
anything special on tty (currently my sysklogd is broken, FC6
problem..)
cc added.
Thomas Gleixner
2006-10-20 20:44:35 UTC
Permalink
Post by Andrew Morton
On Tue, 10 Oct 2006 23:44:04 +0200
Post by Michal Piotrowski
BTW. Kernel hangs while running Cyclictest
(http://rt.wiki.kernel.org/index.php/Cyclictest)
cyclictest -t 10 -l 100000
(or "bin/autotest tests/cyclictest/control" in autotest). I don't see
anything special on tty (currently my sysklogd is broken, FC6
problem..)
Michal,

Is this on an SMP box?

tglx
Olof Johansson
2006-10-10 17:15:19 UTC
Permalink
I keep hitting this on -rc1-mm1. The system comes up but I can't login
since login hits it.

Bisect says that fdtable-implement-new-pagesize-based-fdtable-allocator.patch is at fault.

CONFIG_PPC_64K_PAGES=y is required for it to fail, with 4K pages it's fine.

(Hardware is a Quad G5, 1GB RAM, g5_defconfig + CONFIG_PPC_64K_PAGES, defaults
on all new options)



kernel BUG in copy_fdtable at fs/file.c:138!
Oops: Exception in kernel mode, sig: 5 [#1]
SMP NR_CPUS=4
Modules linked in: snd_pcm_oss snd_mixer_oss joydev snd_pcm snd_page_alloc snd_timer snd soundcore
NIP: C0000000000B926C LR: C0000000000B9210 CTR: C0000000000ADA34
REGS: c00000000f80ba60 TRAP: 0700 Not tainted (2.6.19-rc1-mm1)
MSR: 9000000000029032 <EE,ME,IR,DR> CR: 22002428 XER: 200FFFFF
TASK = c00000000fe38ba0[2022] 'dhclient-script' THREAD: c00000000f808000 CPU: 2
GPR00: 0000000000000001 C00000000F80BCE0 C0000000006CDCC8 C00000003F8CF880
GPR04: 00000000000000D0 0000000042002428 0000000000004000 000000000FEC1950
GPR08: C000000000585480 0000000000000000 C00000003FFD0480 C000000001E04200
GPR12: 100000000000F032 C000000000585480 0000000010000000 0000000010010000
GPR16: 0000000010002790 000000001001819D 00000000FCCDF818 0000000000000000
GPR20: 0000000000000000 00000000FCCDF828 0000000010018790 0000000010018760
GPR24: 00000000F7FDE6F0 C00000003F8CF800 0000000000000000 C00000003F8CF810
GPR28: 0000000000000040 0000000000000000 C0000000005B4B00 C00000000FB511C0
NIP [C0000000000B926C] .expand_files+0x1d4/0x34c
LR [C0000000000B9210] .expand_files+0x178/0x34c
Call Trace:
[C00000000F80BCE0] [C0000000000B91D0] .expand_files+0x138/0x34c (unreliable)
[C00000000F80BD90] [C0000000000ADAD8] .sys_dup2+0xa4/0x1ec
[C00000000F80BE30] [C00000000000871C] syscall_exit+0x0/0x40
Instruction dump:
60000000 4800000c 4bfd3609 60000000 7fe3fb78 4bfdf125 60000000 48000150
835f0000 7f9ae040 7c101026 5400effe <0b000000> 2fbc0000 419e00a4 7b9d1828



-Olof
Andrew Morton
2006-10-10 19:34:49 UTC
Permalink
On Tue, 10 Oct 2006 12:15:19 -0500
Post by Olof Johansson
I keep hitting this on -rc1-mm1. The system comes up but I can't login
since login hits it.
Bisect says that fdtable-implement-new-pagesize-based-fdtable-allocator.patch is at fault.
CONFIG_PPC_64K_PAGES=y is required for it to fail, with 4K pages it's fine.
(Hardware is a Quad G5, 1GB RAM, g5_defconfig + CONFIG_PPC_64K_PAGES, defaults
on all new options)
kernel BUG in copy_fdtable at fs/file.c:138!
OK, thanks. I put the revert patch into
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19-rc1/2.6.19-rc1-mm1/hot-fixes/
Linas Vepstas
2006-10-10 20:20:34 UTC
Permalink
Post by Olof Johansson
I keep hitting this on -rc1-mm1. The system comes up but I can't login
since login hits it.
Bisect says that fdtable-implement-new-pagesize-based-fdtable-allocator.patch is at fault.
CONFIG_PPC_64K_PAGES=y is required for it to fail, with 4K pages it's fine.
(Hardware is a Quad G5, 1GB RAM, g5_defconfig + CONFIG_PPC_64K_PAGES, defaults
on all new options)
kernel BUG in copy_fdtable at fs/file.c:138!
FWIW, I too was hitting this bug, during init:

[ 41.659823] Freeing unused kernel memory: 320k freed
INIT: version 2.86 bootin[ 42.509322] kernel BUG in copy_fdtable at fs/file.c:138!

and of course the system does not come up.

--linas
Vadim Lobanov
2006-10-10 20:31:11 UTC
Permalink
Post by Linas Vepstas
Post by Olof Johansson
I keep hitting this on -rc1-mm1. The system comes up but I can't login
since login hits it.
Bisect says that
fdtable-implement-new-pagesize-based-fdtable-allocator.patch is at fault.
CONFIG_PPC_64K_PAGES=y is required for it to fail, with 4K pages it's fine.
(Hardware is a Quad G5, 1GB RAM, g5_defconfig + CONFIG_PPC_64K_PAGES,
defaults on all new options)
kernel BUG in copy_fdtable at fs/file.c:138!
[ 41.659823] Freeing unused kernel memory: 320k freed
INIT: version 2.86 bootin[ 42.509322] kernel BUG in copy_fdtable at fs/file.c:138!
and of course the system does not come up.
--linas
I'm digging through this right now, trying to figure out exactly what went
wrong (and why some people are seeing this, while others are not). All the
code seems correct; another pair of eyes is always welcome though.

-- Vadim Lobanov
Linas Vepstas
2006-10-10 23:05:23 UTC
Permalink
Post by Vadim Lobanov
Post by Linas Vepstas
Post by Olof Johansson
I keep hitting this on -rc1-mm1. The system comes up but I can't login
since login hits it.
Bisect says that
fdtable-implement-new-pagesize-based-fdtable-allocator.patch is at fault.
CONFIG_PPC_64K_PAGES=y is required for it to fail, with 4K pages it's fine.
(Hardware is a Quad G5, 1GB RAM, g5_defconfig + CONFIG_PPC_64K_PAGES,
defaults on all new options)
kernel BUG in copy_fdtable at fs/file.c:138!
[ 41.659823] Freeing unused kernel memory: 320k freed
INIT: version 2.86 bootin[ 42.509322] kernel BUG in copy_fdtable at fs/file.c:138!
and of course the system does not come up.
I forgot to mention my h/w was completely different (a cell)
Post by Vadim Lobanov
I'm digging through this right now, trying to figure out exactly what went
wrong (and why some people are seeing this, while others are not). All the
code seems correct; another pair of eyes is always welcome though.
The patch that AKPM just posted at

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19-rc1/2.6.19-rc1-mm1/hot-fixes/revert-fdtable-implement-new-pagesize-based-fdtable-allocator.patch

boots for me.

Thanks Andrew!

--linas
Jeremy Fitzhardinge
2006-10-10 18:09:32 UTC
Permalink
Post by Andrew Morton
+generic-implementatation-of-bug.patch
+generic-implementatation-of-bug-fix.patch
+generic-bug-for-i386.patch
+generic-bug-for-x86-64.patch
+uml-add-generic-bug-support.patch
+use-generic-bug-for-ppc.patch
+bug-test-1.patch
No generic-bug for powerpc? Still broken?

J
Andrew Morton
2006-10-10 19:25:11 UTC
Permalink
On Tue, 10 Oct 2006 11:09:32 -0700
Post by Jeremy Fitzhardinge
Post by Andrew Morton
+generic-implementatation-of-bug.patch
+generic-implementatation-of-bug-fix.patch
+generic-bug-for-i386.patch
+generic-bug-for-x86-64.patch
+uml-add-generic-bug-support.patch
+use-generic-bug-for-ppc.patch
+bug-test-1.patch
No generic-bug for powerpc? Still broken?
I didn't do anything to fix it. But I haven't tested it recently - it
might have repaired itself ;)

My plan was to pathetically spam the powerpc guys with it once all the
above is merged up. I took a close look and couldn't see why it was
failing.
Jeremy Fitzhardinge
2006-10-10 19:41:53 UTC
Permalink
Post by Andrew Morton
I didn't do anything to fix it. But I haven't tested it recently - it
might have repaired itself ;)
Surely the way to check is just to throw it in and see who screams...
Post by Andrew Morton
My plan was to pathetically spam the powerpc guys with it once all the
above is merged up. I took a close look and couldn't see why it was
failing.
Oh, I thought it turned out to be some other problem. Er, something
about numa memory stuff?

J
Paul Mackerras
2006-10-10 23:10:23 UTC
Permalink
Post by Andrew Morton
My plan was to pathetically spam the powerpc guys with it once all the
above is merged up. I took a close look and couldn't see why it was
failing.
What was the failure?

Paul.
Jeremy Fitzhardinge
2006-10-10 23:16:11 UTC
Permalink
Post by Paul Mackerras
Post by Andrew Morton
My plan was to pathetically spam the powerpc guys with it once all the
above is merged up. I took a close look and couldn't see why it was
failing.
What was the failure?
I've included it below. But Michael Ellerman said it worked OK for him
when applied to the plain Linus tree, and Andrew said that
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18/2.6.18-mm3/hot-fixes/slab-reduce-numa-text-size-tidy-fix.patch
fixed it. So I thought it was OK.

J
Post by Paul Mackerras
On Wed, 04 Oct 2006 00:22:33 -0600
Post by Andrew Morton
Subject: generic-implementatation-of-bug-fix
x86_64 works OK. powerpc compiles now, but hangs after "returning from
prom_init". I can't see why. The quickest way to fix this is to merge it
into mainline as-is then whistle innocently.
Can any of the powerpc guys spot-the-bug??
Thanks.
This makes powerpc use the generic BUG machinery. The biggest change is that it no longer reports the
function name, since that is redundant with kallsyms, and not needed in general.
There is an overall reduction of code, since module_32/64 duplicated several
functions.
Unfortunately there's no way to tell gcc that BUG won't return, so the BUG
macro includes a goto loop. This will generate a real jmp instruction, which
is never used.
BTW, powerpc doesn't seem to be using BUG_OPCODE or BUG_ILLEGAL_INSTRUCTION
for actual BUGs any more (I presume they were once used). There are still a
couple of uses of those macros elsewhere (kernel/prom_init.c and
kernel/head_64.S); should they be converted to "twi 31,0,0" as well?
---
arch/powerpc/Kconfig | 5 +
arch/powerpc/kernel/module_32.c | 43 +-------------
arch/powerpc/kernel/module_64.c | 43 +-------------
arch/powerpc/kernel/traps.c | 54 ++----------------
arch/powerpc/kernel/vmlinux.lds.S | 6 --
arch/powerpc/xmon/xmon.c | 10 +--
include/asm-powerpc/bug.h | 83 ++++++++++++++--------------
include/asm-powerpc/module.h | 2
8 files changed, 65 insertions(+), 181 deletions(-)
diff -puN arch/powerpc/Kconfig~generic-bug-for-powerpc arch/powerpc/Kconfig
--- a/arch/powerpc/Kconfig~generic-bug-for-powerpc
+++ a/arch/powerpc/Kconfig
@@ -99,6 +99,11 @@ config AUDIT_ARCH
bool
default y
+config GENERIC_BUG
+ bool
+ default y
+ depends on BUG
+
config DEFAULT_UIMAGE
bool
help
diff -puN arch/powerpc/kernel/module_32.c~generic-bug-for-powerpc arch/powerpc/kernel/module_32.c
--- a/arch/powerpc/kernel/module_32.c~generic-bug-for-powerpc
+++ a/arch/powerpc/kernel/module_32.c
@@ -23,6 +23,7 @@
#include <linux/string.h>
#include <linux/kernel.h>
#include <linux/cache.h>
+#include <linux/bug.h>
#if 0
#define DEBUGP printk
@@ -273,48 +274,10 @@ int module_finalize(const Elf_Ehdr *hdr,
const Elf_Shdr *sechdrs,
struct module *me)
{
- char *secstrings;
- unsigned int i;
-
- me->arch.bug_table = NULL;
- me->arch.num_bugs = 0;
-
- /* Find the __bug_table section, if present */
- secstrings = (char *)hdr + sechdrs[hdr->e_shstrndx].sh_offset;
- for (i = 1; i < hdr->e_shnum; i++) {
- if (strcmp(secstrings+sechdrs[i].sh_name, "__bug_table"))
- continue;
- me->arch.bug_table = (void *) sechdrs[i].sh_addr;
- me->arch.num_bugs = sechdrs[i].sh_size / sizeof(struct bug_entry);
- break;
- }
-
- /*
- * Strictly speaking this should have a spinlock to protect against
- * traversals, but since we only traverse on BUG()s, a spinlock
- * could potentially lead to deadlock and thus be counter-productive.
- */
- list_add(&me->arch.bug_list, &module_bug_list);
-
- return 0;
+ return module_bug_finalize(hdr, sechdrs, me);
}
void module_arch_cleanup(struct module *mod)
{
- list_del(&mod->arch.bug_list);
-}
-
-struct bug_entry *module_find_bug(unsigned long bugaddr)
-{
- struct mod_arch_specific *mod;
- unsigned int i;
- struct bug_entry *bug;
-
- list_for_each_entry(mod, &module_bug_list, bug_list) {
- bug = mod->bug_table;
- for (i = 0; i < mod->num_bugs; ++i, ++bug)
- if (bugaddr == bug->bug_addr)
- return bug;
- }
- return NULL;
+ module_bug_cleanup(mod);
}
diff -puN arch/powerpc/kernel/module_64.c~generic-bug-for-powerpc arch/powerpc/kernel/module_64.c
--- a/arch/powerpc/kernel/module_64.c~generic-bug-for-powerpc
+++ a/arch/powerpc/kernel/module_64.c
@@ -20,6 +20,7 @@
#include <linux/moduleloader.h>
#include <linux/err.h>
#include <linux/vmalloc.h>
+#include <linux/bug.h>
#include <asm/module.h>
#include <asm/uaccess.h>
@@ -416,48 +417,10 @@ LIST_HEAD(module_bug_list);
int module_finalize(const Elf_Ehdr *hdr,
const Elf_Shdr *sechdrs, struct module *me)
{
- char *secstrings;
- unsigned int i;
-
- me->arch.bug_table = NULL;
- me->arch.num_bugs = 0;
-
- /* Find the __bug_table section, if present */
- secstrings = (char *)hdr + sechdrs[hdr->e_shstrndx].sh_offset;
- for (i = 1; i < hdr->e_shnum; i++) {
- if (strcmp(secstrings+sechdrs[i].sh_name, "__bug_table"))
- continue;
- me->arch.bug_table = (void *) sechdrs[i].sh_addr;
- me->arch.num_bugs = sechdrs[i].sh_size / sizeof(struct bug_entry);
- break;
- }
-
- /*
- * Strictly speaking this should have a spinlock to protect against
- * traversals, but since we only traverse on BUG()s, a spinlock
- * could potentially lead to deadlock and thus be counter-productive.
- */
- list_add(&me->arch.bug_list, &module_bug_list);
-
- return 0;
+ return module_bug_finalize(hdr, sechdrs, me);
}
void module_arch_cleanup(struct module *mod)
{
- list_del(&mod->arch.bug_list);
-}
-
-struct bug_entry *module_find_bug(unsigned long bugaddr)
-{
- struct mod_arch_specific *mod;
- unsigned int i;
- struct bug_entry *bug;
-
- list_for_each_entry(mod, &module_bug_list, bug_list) {
- bug = mod->bug_table;
- for (i = 0; i < mod->num_bugs; ++i, ++bug)
- if (bugaddr == bug->bug_addr)
- return bug;
- }
- return NULL;
+ module_bug_cleanup(mod);
}
diff -puN arch/powerpc/kernel/traps.c~generic-bug-for-powerpc arch/powerpc/kernel/traps.c
--- a/arch/powerpc/kernel/traps.c~generic-bug-for-powerpc
+++ a/arch/powerpc/kernel/traps.c
@@ -32,6 +32,7 @@
#include <linux/kprobes.h>
#include <linux/kexec.h>
#include <linux/backlight.h>
+#include <linux/bug.h>
#include <asm/kdebug.h>
#include <asm/pgtable.h>
@@ -731,54 +732,9 @@ static int emulate_instruction(struct pt
return -EINVAL;
}
-/*
- * Look through the list of trap instructions that are used for BUG(),
- * BUG_ON() and WARN_ON() and see if we hit one. At this point we know
- * that the exception was caused by a trap instruction of some kind.
- * Returns 1 if we should continue (i.e. it was a WARN_ON) or 0
- * otherwise.
- */
-extern struct bug_entry __start___bug_table[], __stop___bug_table[];
-
-#ifndef CONFIG_MODULES
-#define module_find_bug(x) NULL
-#endif
-
-struct bug_entry *find_bug(unsigned long bugaddr)
+int is_valid_bugaddr(unsigned long addr)
{
- struct bug_entry *bug;
-
- for (bug = __start___bug_table; bug < __stop___bug_table; ++bug)
- if (bugaddr == bug->bug_addr)
- return bug;
- return module_find_bug(bugaddr);
-}
-
-static int check_bug_trap(struct pt_regs *regs)
-{
- struct bug_entry *bug;
- unsigned long addr;
-
- if (regs->msr & MSR_PR)
- return 0; /* not in kernel */
- addr = regs->nip; /* address of trap instruction */
- if (addr < PAGE_OFFSET)
- return 0;
- bug = find_bug(regs->nip);
- if (bug == NULL)
- return 0;
- if (bug->line & BUG_WARNING_TRAP) {
- /* this is a WARN_ON rather than BUG/BUG_ON */
- printk(KERN_ERR "Badness in %s at %s:%ld\n",
- bug->function, bug->file,
- bug->line & ~BUG_WARNING_TRAP);
- dump_stack();
- return 1;
- }
- printk(KERN_CRIT "kernel BUG in %s at %s:%ld!\n",
- bug->function, bug->file, bug->line);
-
- return 0;
+ return is_kernel_addr(addr);
}
void __kprobes program_check_exception(struct pt_regs *regs)
@@ -812,7 +768,9 @@ void __kprobes program_check_exception(s
return;
if (debugger_bpt(regs))
return;
- if (check_bug_trap(regs)) {
+
+ if (!(regs->msr & MSR_PR) && /* not user-mode */
+ report_bug(regs->nip) == BUG_TRAP_TYPE_WARN) {
regs->nip += 4;
return;
}
diff -puN arch/powerpc/kernel/vmlinux.lds.S~generic-bug-for-powerpc arch/powerpc/kernel/vmlinux.lds.S
--- a/arch/powerpc/kernel/vmlinux.lds.S~generic-bug-for-powerpc
+++ a/arch/powerpc/kernel/vmlinux.lds.S
@@ -62,11 +62,7 @@ SECTIONS
__stop___ex_table = .;
}
- __bug_table : {
- __start___bug_table = .;
- *(__bug_table)
- __stop___bug_table = .;
- }
+ BUG_TABLE
/*
* Init sections discarded at runtime
diff -puN arch/powerpc/xmon/xmon.c~generic-bug-for-powerpc arch/powerpc/xmon/xmon.c
--- a/arch/powerpc/xmon/xmon.c~generic-bug-for-powerpc
+++ a/arch/powerpc/xmon/xmon.c
@@ -19,6 +19,7 @@
#include <linux/module.h>
#include <linux/sysrq.h>
#include <linux/interrupt.h>
+#include <linux/bug.h>
#include <asm/ptrace.h>
#include <asm/string.h>
@@ -32,7 +33,6 @@
#include <asm/cputable.h>
#include <asm/rtas.h>
#include <asm/sstep.h>
-#include <asm/bug.h>
#ifdef CONFIG_PPC64
#include <asm/hvcall.h>
@@ -1329,7 +1329,7 @@ static void backtrace(struct pt_regs *ex
static void print_bug_trap(struct pt_regs *regs)
{
- struct bug_entry *bug;
+ const struct bug_entry *bug;
unsigned long addr;
if (regs->msr & MSR_PR)
@@ -1340,11 +1340,11 @@ static void print_bug_trap(struct pt_reg
bug = find_bug(regs->nip);
if (bug == NULL)
return;
- if (bug->line & BUG_WARNING_TRAP)
+ if (is_warning_bug(bug))
return;
- printf("kernel BUG in %s at %s:%d!\n",
- bug->function, bug->file, (unsigned int)bug->line);
+ printf("kernel BUG at %s:%u!\n",
+ bug->file, bug->line);
}
void excprint(struct pt_regs *fp)
diff -puN include/asm-powerpc/bug.h~generic-bug-for-powerpc include/asm-powerpc/bug.h
--- a/include/asm-powerpc/bug.h~generic-bug-for-powerpc
+++ a/include/asm-powerpc/bug.h
@@ -13,37 +13,40 @@
#ifndef __ASSEMBLY__
-struct bug_entry {
- unsigned long bug_addr;
- long line;
- const char *file;
- const char *function;
-};
-
-struct bug_entry *find_bug(unsigned long bugaddr);
-
-/*
- * If this bit is set in the line number it means that the trap
- * is for WARN_ON rather than BUG or BUG_ON.
- */
-#define BUG_WARNING_TRAP 0x1000000
-
#ifdef CONFIG_BUG
+/* _EMIT_BUG_ENTRY expects args %0,%1,%2,%3 to be FILE, LINE, flags and
+ sizeof(struct bug_entry), respectively */
+#ifdef CONFIG_DEBUG_BUGVERBOSE
+#define _EMIT_BUG_ENTRY \
+ ".section __bug_table,\"a\"\n" \
+ "2:\t" PPC_LONG "1b, %0\n" \
+ "\t.short %1, %2\n" \
+ ".org 2b+%3\n" \
+ ".previous\n"
+#else
+#define _EMIT_BUG_ENTRY \
+ ".section __bug_table,\"a\"\n" \
+ "2:\t" PPC_LONG "1b\n" \
+ "\t.short %2\n" \
+ ".org 2b+%3\n" \
+ ".previous\n"
+#endif
+
/*
* BUG_ON() and WARN_ON() do their best to cooperate with compile-time
* optimisations. However depending on the complexity of the condition
* some compiler versions may not produce optimal results.
*/
-#define BUG() do { \
- __asm__ __volatile__( \
- "1: twi 31,0,0\n" \
- ".section __bug_table,\"a\"\n" \
- "\t"PPC_LONG" 1b,%0,%1,%2\n" \
- ".previous" \
- : : "i" (__LINE__), "i" (__FILE__), "i" (__FUNCTION__)); \
-} while (0)
+#define BUG() do { \
+ __asm__ __volatile__( \
+ "1: twi 31,0,0\n" \
+ _EMIT_BUG_ENTRY \
+ : : "i" (__FILE__), "i" (__LINE__), \
+ "i" (0), "i" (sizeof(struct bug_entry))); \
+ for(;;) ; \
+ } while (0)
#define BUG_ON(x) do { \
if (__builtin_constant_p(x)) { \
@@ -51,23 +54,22 @@ struct bug_entry *find_bug(unsigned long
BUG(); \
} else { \
__asm__ __volatile__( \
- "1: "PPC_TLNEI" %0,0\n" \
- ".section __bug_table,\"a\"\n" \
- "\t"PPC_LONG" 1b,%1,%2,%3\n" \
- ".previous" \
- : : "r" ((long)(x)), "i" (__LINE__), \
- "i" (__FILE__), "i" (__FUNCTION__)); \
+ "1: "PPC_TLNEI" %4,0\n" \
+ _EMIT_BUG_ENTRY \
+ : : "i" (__FILE__), "i" (__LINE__), "i" (0), \
+ "i" (sizeof(struct bug_entry)), \
+ "r" ((long)(x))); \
+ for(;;) ; \
} \
} while (0)
#define __WARN() do { \
__asm__ __volatile__( \
"1: twi 31,0,0\n" \
- ".section __bug_table,\"a\"\n" \
- "\t"PPC_LONG" 1b,%0,%1,%2\n" \
- ".previous" \
- : : "i" (__LINE__ + BUG_WARNING_TRAP), \
- "i" (__FILE__), "i" (__FUNCTION__)); \
+ _EMIT_BUG_ENTRY \
+ : : "i" (__FILE__), "i" (__LINE__), \
+ "i" (BUGFLAG_WARNING), \
+ "i" (sizeof(struct bug_entry))); \
} while (0)
#define WARN_ON(x) ({ \
@@ -77,13 +79,12 @@ struct bug_entry *find_bug(unsigned long
__WARN(); \
} else { \
__asm__ __volatile__( \
- "1: "PPC_TLNEI" %0,0\n" \
- ".section __bug_table,\"a\"\n" \
- "\t"PPC_LONG" 1b,%1,%2,%3\n" \
- ".previous" \
- : : "r" (__ret_warn_on), \
- "i" (__LINE__ + BUG_WARNING_TRAP), \
- "i" (__FILE__), "i" (__FUNCTION__)); \
+ "1: "PPC_TLNEI" %4,0\n" \
+ _EMIT_BUG_ENTRY \
+ : : "i" (__FILE__), "i" (__LINE__), \
+ "i" (BUGFLAG_WARNING), \
+ "i" (sizeof(struct bug_entry)), \
+ "r" (__ret_warn_on)); \
} \
unlikely(__ret_warn_on); \
})
diff -puN include/asm-powerpc/module.h~generic-bug-for-powerpc include/asm-powerpc/module.h
--- a/include/asm-powerpc/module.h~generic-bug-for-powerpc
+++ a/include/asm-powerpc/module.h
@@ -46,8 +46,6 @@ struct mod_arch_specific {
unsigned int num_bugs;
};
-extern struct bug_entry *module_find_bug(unsigned long bugaddr);
-
/*
* Select ELF headers.
* Make empty section for module_frob_arch_sections to expand.
_
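
For readers following the conversion, here is a user-space sketch of the table lookup that the removed find_bug()/check_bug_trap() code performed and that the generic code now takes over. The entry layout, the `demo_`/`DEMO_` names and the flag value are illustrative assumptions, not the kernel's definitions.

```c
#include <stddef.h>

/* Illustrative entry layout (assumption): address of the trap instruction,
 * source location, and a flags word marking WARN_ON-style entries. */
struct demo_bug_entry {
    unsigned long bug_addr;
    const char *file;
    unsigned short line;
    unsigned short flags;
};

#define DEMO_BUGFLAG_WARNING 0x1   /* hypothetical value */

enum demo_trap_type { DEMO_TRAP_NONE, DEMO_TRAP_WARN, DEMO_TRAP_BUG };

/* Linear search of a bug table for the faulting address. */
static const struct demo_bug_entry *
demo_find_bug(const struct demo_bug_entry *table, size_t n, unsigned long addr)
{
    for (size_t i = 0; i < n; i++)
        if (table[i].bug_addr == addr)
            return &table[i];
    return NULL;
}

/* Classify a trap: unknown address, warning (caller may resume), or fatal. */
static enum demo_trap_type
demo_report_bug(const struct demo_bug_entry *table, size_t n, unsigned long addr)
{
    const struct demo_bug_entry *bug = demo_find_bug(table, n, addr);

    if (bug == NULL)
        return DEMO_TRAP_NONE;
    return (bug->flags & DEMO_BUGFLAG_WARNING) ? DEMO_TRAP_WARN : DEMO_TRAP_BUG;
}
```

In the patch, the warning case is the one where program_check_exception() advances regs->nip by 4 and returns; anything else falls through to the normal exception handling.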
Andrew Morton
2006-10-10 23:37:02 UTC
Permalink
On Wed, 11 Oct 2006 09:10:23 +1000
Post by Paul Mackerras
Post by Andrew Morton
My plan was to pathetically spam the powerpc guys with it once all the
above is merged up. I took a close look and couldn't see why it was
failing.
What was the failure?
White-screen hang after "returning from prom_init".

I'll merge Jeremy's latest patches and retest this evening.
Badari Pulavarty
2006-10-10 22:17:15 UTC
Permalink
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19-rc1/2.6.19-rc1-mm1/
I hate to always report failures, I hope you don't hate me
for that :)

My EM64T box doesn't boot -mm1. Seems like an IRQ problem?

Thanks,
Badari

Linux version 2.6.19-rc1-mm1-smp (***@elm3a241) (gcc version 4.1.1 20060612 (Red Hat 4.1.1-3)) #2 SMP Tue Oct 10 10:37:46 PDT 2006
Command line: ro root=/dev/VolGroup00/LogVol00 console=tty0 console=ttyS0,38400 rhgb verbose
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009dc00 (usable)
BIOS-e820: 000000000009dc00 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 00000000d7fcca80 (usable)
BIOS-e820: 00000000d7fcca80 - 00000000d7fd0000 (ACPI data)
BIOS-e820: 00000000d7fd0000 - 00000000d8000000 (reserved)
BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
BIOS-e820: 0000000100000000 - 0000000128000000 (usable)
end_pfn_map = 1212416
DMI 2.3 present.
ERROR: Invalid checksum
No NUMA configuration found
Faking a node at 0000000000000000-0000000128000000
Bootmem setup node 0 0000000000000000-0000000128000000
Zone PFN ranges:
DMA 0 -> 4096
DMA32 4096 -> 1048576
Normal 1048576 -> 1212416
early_node_map[3] active PFN ranges
0: 0 -> 157
0: 256 -> 884684
0: 1048576 -> 1212416
Intel MultiProcessor Specification v1.4
MPTABLE: OEM ID: IBM ENSW MPTABLE: Product ID: x346 SMP MPTABLE: APIC at: 0xFEE00000
Processor #0 (Bootup-CPU)
Processor #6
I/O APIC #14 at 0xFEC00000.
I/O APIC #13 at 0xFEC84000.
I/O APIC #12 at 0xFEC84400.
I/O APIC #11 at 0xFEC80000.
I/O APIC #10 at 0xFEC80400.
Setting APIC routing to physical flat
Processors: 2
Nosave address range: 000000000009d000 - 000000000009e000
Nosave address range: 000000000009e000 - 00000000000a0000
Nosave address range: 00000000000a0000 - 00000000000e0000
Nosave address range: 00000000000e0000 - 0000000000100000
Nosave address range: 00000000d7fcc000 - 00000000d7fcd000
Nosave address range: 00000000d7fcd000 - 00000000d7fd0000
Nosave address range: 00000000d7fd0000 - 00000000d8000000
Nosave address range: 00000000d8000000 - 00000000fec00000
Nosave address range: 00000000fec00000 - 0000000100000000
Allocating PCI resources starting at dc000000 (gap: d8000000:26c00000)
SMP: Allowing 2 CPUs, 0 hotplug CPUs
PERCPU: Allocating 50432 bytes of per cpu data
Built 1 zonelists. Total pages: 1019681
Kernel command line: ro root=/dev/VolGroup00/LogVol00 console=tty0 console=ttyS0,38400 rhgb verbose
Initializing CPU#0
PID hash table entries: 4096 (order: 12, 32768 bytes)
Console: colour VGA+ 80x25
Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar
... MAX_LOCKDEP_SUBCLASSES: 8
... MAX_LOCK_DEPTH: 30
... MAX_LOCKDEP_KEYS: 2048
... CLASSHASH_SIZE: 1024
... MAX_LOCKDEP_ENTRIES: 8192
... MAX_LOCKDEP_CHAINS: 8192
... CHAINHASH_SIZE: 4096
memory used by lock dependency info: 1328 kB
per task-struct memory footprint: 1680 bytes
Dentry cache hash table entries: 524288 (order: 10, 4194304 bytes)
Inode-cache hash table entries: 262144 (order: 9, 2097152 bytes)
Checking aperture...
PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
Placing software IO TLB between 0x1675000 - 0x5675000
Memory: 4004408k/4849664k available (2714k kernel code, 189292k reserved, 2094k data, 328k init)
Calibrating delay using timer specific routine.. 6014.59 BogoMIPS (lpj=12029192)
Security Framework v1.0.0 initialized
Mount-cache hash table entries: 256
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 2048K
CPU 0/0 -> Node 0
using mwait in idle threads.
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 0
CPU0: Thermal monitoring enabled (TM1)
lockdep: not fixing up alternatives.
Using local APIC timer interrupts.
result 12500678
Detected 12.500 MHz APIC timer.
lockdep: not fixing up alternatives.
Booting processor 1/2 APIC 0x6
Initializing CPU#1
Calibrating delay using timer specific routine.. 6000.53 BogoMIPS (lpj=12001079)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 2048K
CPU 1/6 -> Node 0
CPU: Physical Processor ID: 3
CPU: Processor Core ID: 0
CPU1: Thermal monitoring enabled (TM1)
Intel(R) Xeon(TM) CPU 3.00GHz stepping 03
Brought up 2 CPUs
testing NMI watchdog ... OK.
time.c: Using 1.193182 MHz WALL PIT GTOD PIT/TSC timer.
time.c: Detected 3000.178 MHz processor.
migration_cost=1820
checking if image is initramfs... it is
Freeing initrd memory: 1743k freed
NET: Registered protocol family 16
PCI: Using configuration type 1
ACPI: Interpreter disabled.
SCSI subsystem initialized
PCI: Probing PCI hardware
PCI quirk: region 0580-05ff claimed by ICH4 ACPI/GPIO/TCO
PCI quirk: region 0400-043f claimed by ICH4 GPIO
PCI: PXH quirk detected, disabling MSI for SHPC device
PCI: PXH quirk detected, disabling MSI for SHPC device
PCI: Transparent bridge - 0000:00:1e.0
PCI->APIC IRQ transform: 0000:00:1d.0[A] -> IRQ 16
PCI->APIC IRQ transform: 0000:00:1d.1[B] -> IRQ 19
PCI->APIC IRQ transform: 0000:00:1d.7[D] -> IRQ 23
PCI->APIC IRQ transform: 0000:00:1f.1[A] -> IRQ 17
PCI->APIC IRQ transform: 0000:00:1f.3[B] -> IRQ 17
PCI->APIC IRQ transform: 0000:05:00.0[A] -> IRQ 16
PCI->APIC IRQ transform: 0000:06:00.0[A] -> IRQ 16
PCI->APIC IRQ transform: 0000:08:07.0[A] -> IRQ 27
PCI->APIC IRQ transform: 0000:08:07.1[B] -> IRQ 24
PCI->APIC IRQ transform: 0000:01:06.0[A] -> IRQ 20
PCI-GART: No AMD northbridge found.
PCI: Bridge: 0000:02:00.0
IO window: disabled.
MEM window: disabled.
PREFETCH window: disabled.
PCI: Bridge: 0000:02:00.2
IO window: disabled.
MEM window: disabled.
PREFETCH window: disabled.
PCI: Bridge: 0000:00:02.0
IO window: disabled.
MEM window: disabled.
PREFETCH window: disabled.
PCI: Bridge: 0000:00:04.0
IO window: disabled.
MEM window: dd000000-deffffff
PREFETCH window: disabled.
PCI: Bridge: 0000:00:05.0
IO window: disabled.
MEM window: db000000-dcffffff
PREFETCH window: disabled.
PCI: Bridge: 0000:07:00.0
IO window: 4000-4fff
MEM window: d9000000-daffffff
PREFETCH window: df000000-df0fffff
PCI: Bridge: 0000:07:00.2
IO window: 5000-ffff
MEM window: disabled.
PREFETCH window: disabled.
PCI: Bridge: 0000:00:06.0
IO window: 4000-ffff
MEM window: d9000000-daffffff
PREFETCH window: df000000-df0fffff
PCI: Bridge: 0000:00:1e.0
IO window: 3000-3fff
MEM window: f8000000-f8ffffff
PREFETCH window: f0000000-f7ffffff
PCI: No IRQ known for interrupt pin A of device 0000:00:02.0. Probably buggy MP table.
PCI: No IRQ known for interrupt pin A of device 0000:00:04.0. Probably buggy MP table.
PCI: No IRQ known for interrupt pin A of device 0000:00:05.0. Probably buggy MP table.
PCI: No IRQ known for interrupt pin A of device 0000:00:06.0. Probably buggy MP table.
NET: Registered protocol family 2
IP route cache hash table entries: 131072 (order: 8, 1048576 bytes)
TCP established hash table entries: 65536 (order: 9, 3670016 bytes)
TCP bind hash table entries: 32768 (order: 8, 1835008 bytes)
TCP: Hash tables configured (established 65536 bind 32768)
TCP reno registered
audit: initializing netlink socket (disabled)
audit(1160515309.208:1): initialized
Total HugeTLB memory allocated, 0
VFS: Disk quotas dquot_6.5.1
Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered (default)
pcie_portdrv_probe->Dev[3595:8086] has invalid IRQ. Check vendor BIOS
pcie_portdrv_probe->Dev[3597:8086] has invalid IRQ. Check vendor BIOS
pcie_portdrv_probe->Dev[3598:8086] has invalid IRQ. Check vendor BIOS
pcie_portdrv_probe->Dev[3599:8086] has invalid IRQ. Check vendor BIOS
Real Time Clock Driver v1.12ac
Non-volatile memory driver v1.2
Linux agpgart interface v0.101 (c) Dave Jones
Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing disabled
serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
RAMDISK driver initialized: 16 RAM disks of 128000K size 1024 blocksize
tg3.c:v3.66 (September 23, 2006)
eth0: Tigon3 [partno(BCM95721) rev 4101 PHY(5750)] (PCI Express) 10/100/1000BaseT Ethernet 00:11:25:8f:60:06
eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] Split[0] WireSpeed[1] TSOcap[1]
eth0: dma_rwctrl[76180000] dma_mask[64-bit]
eth1: Tigon3 [partno(BCM95721) rev 4101 PHY(5750)] (PCI Express) 10/100/1000BaseT Ethernet 00:11:25:8f:60:07
eth1: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] Split[0] WireSpeed[1] TSOcap[1]
eth1: dma_rwctrl[76180000] dma_mask[64-bit]
scsi0 : Adaptec AIC79XX PCI-X SCSI HBA DRIVER, Rev 3.0
<Adaptec AIC7902 Ultra320 SCSI adapter>
aic7902: Ultra320 Wide Channel A, SCSI Id=7, PCI-X 101-133Mhz, 512 SCBs
scsi1 : Adaptec AIC79XX PCI-X SCSI HBA DRIVER, Rev 3.0
<Adaptec AIC7902 Ultra320 SCSI adapter>
aic7902: Ultra320 Wide Channel B, SCSI Id=7, PCI-X 101-133Mhz, 512 SCBs
scsi 1:0:0:0: Direct-Access IBM-ESXS MAP3367NC FN C101 PQ: 0 ANSI: 3
target1:0:0: asynchronous
scsi1:A:0:0: Tagged Queuing enabled. Depth 32
target1:0:0: Beginning Domain Validation
target1:0:0: wide asynchronous
target1:0:0: FAST-160 WIDE SCSI 320.0 MB/s DT IU RTI PCOMP (6.25 ns, offset 127)
target1:0:0: Ending Domain Validation
scsi 1:0:2:0: Direct-Access IBM-ESXS MAP3367NC FN C101 PQ: 0 ANSI: 3
target1:0:2: asynchronous
scsi1:A:2:0: Tagged Queuing enabled. Depth 32
target1:0:2: Beginning Domain Validation
target1:0:2: wide asynchronous
target1:0:2: FAST-160 WIDE SCSI 320.0 MB/s DT IU RTI PCOMP (6.25 ns, offset 127)
target1:0:2: Ending Domain Validation
scsi 1:0:8:0: Processor IBM 25R5170a S320 0 1 PQ: 0 ANSI: 2
target1:0:8: asynchronous
target1:0:8: Beginning Domain Validation
target1:0:8: Ending Domain Validation
SCSI device sda: 71096640 512-byte hdwr sectors (36401 MB)
sda: Write Protect is off
SCSI device sda: drive cache: write through
SCSI device sda: 71096640 512-byte hdwr sectors (36401 MB)
sda: Write Protect is off
SCSI device sda: drive cache: write through
sda: sda1 sda2
sd 1:0:0:0: Attached scsi disk sda
SCSI device sdb: 71096640 512-byte hdwr sectors (36401 MB)
sdb: Write Protect is off
SCSI device sdb: drive cache: write through
SCSI device sdb: 71096640 512-byte hdwr sectors (36401 MB)
sdb: Write Protect is off
SCSI device sdb: drive cache: write through
sdb: sdb1 sdb2
sd 1:0:2:0: Attached scsi disk sdb
sd 1:0:0:0: Attached scsi generic sg0 type 0
sd 1:0:2:0: Attached scsi generic sg1 type 0
scsi 1:0:8:0: Attached scsi generic sg2 type 3
serio: i8042 KBD port at 0x60,0x64 irq 1
serio: i8042 AUX port at 0x60,0x64 irq 12
mice: PS/2 mouse device common for all mice
input: PC Speaker as /class/input/input0
NET: Registered protocol family 1
Freeing unused kernel memory: 328k freed
Write protecting the kernel read-only data: 565k
input: AT Translated Set 2 keyboard as /class/input/input1
Red Hat nash version 5.0.41 starting
Mounting proc filesystem
Mounting sysfs filesystem
Creating /dev
Creating initial device nodes
Setting up hotplug.
input: PS/2 Generic Mouse as /class/input/input2
Creating block device nodes.
Loading ide-core.ko module
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
Loading ide-disk.ko module
Loading dm-mod.ko module
device-mapper: ioctl: 4.10.0-ioctl (2006-09-14) initialised: dm-***@redhat.com
Loading dm-mirror.ko module
Loading dm-zero.ko module
Loading dm-snapshot.ko module
Making device-mapper control node
Scanning logical volumes
Reading all physical volumes. This may take a while...
Found volume group "VolGroup00" using metadata type lvm2
Activating logical volumes
2 logical volume(s) in volume group "VolGroup00" now active
Creating root device.
Mounting root filesystem.
kjournald starting. Commit interval 5 seconds
EXT3-fs: mounted filesystem with ordered data mode.
Setting up other filesystems.
Setting up new root fs
no fstab.sys, mounting internal defaults
Switching to new root and running init.
unmounting old /dev
unmounting old /proc
unmounting old /proc/bus/usb
ERROR unmounting old /proc/bus/usb: No such file or directory
forcing unmount of /proc/bus/usb
unmounting old /sys
INIT: version 2.86 booting
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
Welcome to Red Hat Enterprise Linux
Press 'I' to enter interactive startup.
warning: process `date' used the removed sysctl system call
Setting clock (utc): Tue Oct 10 14:22:36 PDT 2006 [ OK ]
Starting udev: udevd[539]: add_to_rules: unknown key 'MODALIAS'
udevd[539]: add_to_rules: unknown key 'MODALIAS'
udevd[539]: add_to_rules: unknown key 'MODALIAS'
[ OK ]
Setting hostname elm3a241: [ OK ]
Setting up Logical Volume Management: 2 logical volume(s) in volume group "VolGroup00" now active
[ OK ]
Checking filesystems
Checking all file systems.
[/sbin/fsck.ext3 (1) -- /] fsck.ext3 -a /dev/VolGroup00/LogVol00
/dev/VolGroup00/LogVol00: clean, 454331/8339520 files, 6170765/8339456 blocks
[/sbin/fsck.ext3 (1) -- /boot] fsck.ext3 -a /dev/sda1
/boot: clean, 67/26104 files, 101516/104388 blocks
[ OK ]
Remounting root filesystem in read-write mode: [ OK ]
Mounting local filesystems: [ OK ]
Enabling local filesystem quotas: [ OK ]
Enabling local swap partitions: swapon: /dev/dm-1: Device or resource busy
[FAILED]
Enabling /etc/fstab swaps: [ OK ]
INIT: Entering runlevel: 3
Entering non-interactive startup
Starting readahead_early: Starting background readahead: [ OK ]
[ OK ]
FATAL: Error inserting acpi_cpufreq (/lib/modules/2.6.19-rc1-mm1-smp/kernel/arch/x86_64/kernel/cpufreq/acpi-cpufreq.ko): No such device
Applying ip6tables firewall rules: [ OK ]
Bringing up loopback interface: [ OK ]
Bringing up interface eth0: [ OK ]
Starting system logger: [ OK ]
Starting kernel logger: [ OK ]
Starting irqbalance: [ OK ]
Starting portmap: [ OK ]
Starting NFS statd: [ OK ]
Starting RPC idmapd: [ OK ]
Starting system message bus: [ OK ]
[ OK ] Bluetooth services:[ OK ]
Mounting other filesystems: [ OK ]
Starting hidd: [ OK ]
Loading autofs4: [ OK ]
Starting automount: [ OK ]
Starting smartd: [ OK ]
Starting hpiod: [ OK ]
Starting hpssd: [ OK ]
Starting cups: [ OK ]
Starting sshd: [ OK ]
Starting sendmail: [ OK ]
Starting sm-client: [ OK ]
Starting console mouse services: [ OK ]
Starting crond: [ OK ]
Starting xfs: [ OK ]
Starting anacron: [ OK ]
Starting atd: [ OK ]
Starting yum-updatesd: [ OK ]
Starting Avahi daemon: do_IRQ: 0.57 No irq handler for vector
Arjan van de Ven
2006-10-11 06:56:26 UTC
Permalink
Post by Badari Pulavarty
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19-rc1/2.6.19-rc1-mm1/
I hate to always report failures, I hope you don't hate me
for that :)
My EM64T box doesn't boot -mm1. Seems like an IRQ problem?
Starting Avahi daemon: do_IRQ: 0.57 No irq handler for vector
I'm seeing something similar (different number though) a few minutes
after boot with yesterday's git snapshot... something is sick in irq
land...
--
if you want to mail me at work (you don't), use arjan (at) linux.intel.com
Badari Pulavarty
2006-10-11 03:13:49 UTC
Permalink
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19-rc1/2.6.19-rc1-mm1/
- Grab updated e2fsprogs from
ftp://ftp.kernel.org/pub/linux/kernel/people/tytso/e2fsprogs-interim/
Hi Ted,

e2fsprogs-1.39-tyt1-rollup.patch doesn't compile (it seems to be missing
percent.c). Can you roll up a new version? I had to apply the individual
patches to get it working.

Thanks,
Badari
Andrew Morton
2006-10-11 04:01:33 UTC
Permalink
On Tue, 10 Oct 2006 20:13:49 -0700
Post by Badari Pulavarty
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19-rc1/2.6.19-rc1-mm1/
- Grab updated e2fsprogs from
ftp://ftp.kernel.org/pub/linux/kernel/people/tytso/e2fsprogs-interim/
Hi Ted,
e2fsprogs-1.39-tyt1-rollup.patch doesn't compile (it seems to be missing
percent.c). Can you roll up a new version? I had to apply the individual
patches to get it working.
http://userweb.kernel.org/~akpm/e2fsprogs-akpm.tar.gz is the version I
used. That's e2fsprogs-1.39 plus patches from
http://www.bullopensource.org/ext4/20060926/
Andrew Morton
2006-10-11 16:56:39 UTC
Permalink
On Wed, 11 Oct 2006 08:02:14 -0700
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19-rc1/2.6.19-rc1-mm1/
- Grab updated e2fsprogs from
ftp://ftp.kernel.org/pub/linux/kernel/people/tytso/e2fsprogs-interim/
ext4 did not survive my simple stress tests.
I ran 4 copies of fsx on ext4dev filesystem (without extents) +
4 copies of fsx on ext4dev (with extents).
Machine hung after running for few hours. There are 4 fsx sigsegv
messages on the console and the last message on the console is
do_IRQ: 0.62 No irq handler for vector
Quite a few people are hitting that - it's related to Eric's recent
IRQ/APIC-routing changes.

I don't know why that would cause fsx to get a sigsegv though.
-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Badari Pulavarty
2006-10-11 17:08:30 UTC
Permalink
Post by Andrew Morton
On Wed, 11 Oct 2006 08:02:14 -0700
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19-rc1/2.6.19-rc1-mm1/
- Grab updated e2fsprogs from
ftp://ftp.kernel.org/pub/linux/kernel/people/tytso/e2fsprogs-interim/
ext4 did not survive my simple stress tests.
I ran 4 copies of fsx on ext4dev filesystem (without extents) +
4 copies of fsx on ext4dev (with extents).
Machine hung after running for few hours. There are 4 fsx sigsegv
messages on the console and the last message on the console is
do_IRQ: 0.62 No irq handler for vector
Quite a few people are hitting that - it's related to Eric's recent
IRQ/APIC-routing changes.
I don't know why that would cause fsx to get a sigsegv though.
I don't think they are related. Hopefully once we figure out IRQ
problem, we can try to track down fsx problems.

Thanks,
Badari

Theodore Tso
2006-10-11 12:51:08 UTC
Permalink
Post by Badari Pulavarty
Hi Ted,
e2fsprogs-1.39-tyt1-rollup.patch doesn't compile (it seems to be missing
percent.c). Can you roll up a new version? I had to apply the individual
patches to get it working.
OK, fixed. I've also created a slightly new structure in:

ftp://ftp.kernel.org/pub/linux/kernel/people/tytso/e2fsprogs-interim

Each new version will be its own directory, i.e., e2fsprogs-1.39-tyt1,
with a symlink LATEST pointing at the most recent directory.

Within each directory, there will be a tarball of the complete
sources, as requested by akpm, as well as a broken-out tar.gz file and a
single file that has all of the patches rolled up. I've regenerated
the rollup patches this time with feeling (and the -N diff option :-),
so once the new structure gets mirrored out from master.kernel.org to
ftp.kernel.org, you should be able to get the fixed rollup patch, as
well as a pre-patched tarball.

Regards,

- Ted
Martin J. Bligh
2006-10-11 19:54:24 UTC
Permalink
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19-rc1/2.6.19-rc1-mm1/
fsx seems to fail now, across several different machines.

http://test.kernel.org/functional/index.html


and drill down under "regression" on the failing ones.

eg, see end of
http://test.kernel.org/abat/54516/debug/test.log.1 (i386)
and
http://test.kernel.org/abat/54503/debug/test.log.1 (x86_64)

-rc1 is OK (as is -rc1-git7)
Badari Pulavarty
2006-10-11 21:58:57 UTC
Permalink
Post by Martin J. Bligh
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19-rc1/2.6.19-rc1-mm1/
fsx seems to fail now, across several different machines.
http://test.kernel.org/functional/index.html
and drill down under "regression" on the failing ones.
eg, see end of
http://test.kernel.org/abat/54516/debug/test.log.1 (i386)
and
http://test.kernel.org/abat/54503/debug/test.log.1 (x86_64)
I am seeing fsx failures on 1k/2k ext3 filesystems, but not on 4k.
Do you know the filesystem type & blocksize ?

Thanks,
Badari
Andy Whitcroft
2006-10-16 15:56:35 UTC
Permalink
Post by Badari Pulavarty
Post by Martin J. Bligh
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19-rc1/2.6.19-rc1-mm1/
fsx seems to fail now, across several different machines.
http://test.kernel.org/functional/index.html
and drill down under "regression" on the failing ones.
eg, see end of
http://test.kernel.org/abat/54516/debug/test.log.1 (i386)
and
http://test.kernel.org/abat/54503/debug/test.log.1 (x86_64)
I am seeing fsx failures on 1k/2k ext3 filesystems, but not on 4k.
Do you know the filesystem type & blocksize ?
Ok. I've been poking at these results to try and get you these answers.
In the process I noted that the benchmark was recently reviewed and
overhauled. Looking at the changes it looks like we are now reporting
the results from some tests which are backgrounded for additional load,
which would not have previously been reported. So this might not be a
new phenomenon. We have some stable tests coming through now, so I
should be able to use them as a reference to be sure.

Here are the ones which are currently failing. Note that the 139 at the
start is the exit status as reported by the shell (128 + 11), so SIGSEGV:

bl6-13: x86_64 ext3
-------------------
139 ./fsx-linux -l 500000 -r 4096 -t 2048 -w 2048 -Z -R -W -N 10000 test/junkfile
139 ./fsx-linux -N 10000 -o 128000 -A -l 500000 -r 512 -t 4096 -w 1024 -Z -R -W test/junkfile


elm3b239: x86_64 reiserfs
-------------------------
139 ./fsx-linux -N 10000 -o 8192 -A -l 500000 -r 1024 -t 2048 -w 2048 -Z -R -W test/junkfile
139 ./fsx-linux -N 10000 -o 128000 -r 2048 -w 4096 -Z -R -W test/junkfile
139 ./fsx-linux -N 10000 -o 8192 -A -l 500000 -r 1024 -t 2048 -w 1024 -Z -R -W test/junkfile
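
For reference, the arithmetic behind reading "139" as SIGSEGV can be sketched in a few lines of Python. This is only an illustration, not part of the test harness: the helper name `decode_shell_status` is made up here, and it assumes the usual shell convention of reporting 128 plus the signal number for a process killed by a signal.

```python
import signal

def decode_shell_status(status):
    """Interpret a shell-reported exit status (hypothetical helper).

    Assumes the common shell convention: a value above 128 means the
    process was killed by signal number (status - 128).
    """
    if status > 128:
        signo = status - 128
        try:
            return signal.Signals(signo).name
        except ValueError:
            # Not a signal number known on this platform.
            return "signal %d" % signo
    return "exit code %d" % status

# 139 - 128 = 11, which is SIGSEGV on Linux.
print(decode_shell_status(139))
```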

I have also seen the following style messages on 19-rc1-mm1:

short write: 0x15000 bytes instead of 0xf000

Note that this really does mean a _long_ write!
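
To spell out why that message describes a long write: it appears to be printed whenever the byte count returned by the write differs from the size requested, with the returned count first. A minimal Python sketch of that comparison, using the two values from the message (the function name is invented, and the "printed on any mismatch" behaviour is an assumption based on the message format):

```python
def classify_write(returned, requested):
    """Compare a write()'s return value with the requested size."""
    if returned == requested:
        return "ok"
    if returned < requested:
        return "short by 0x%x bytes" % (requested - returned)
    return "long by 0x%x bytes" % (returned - requested)

# Values taken from the message above: 0x15000 returned, 0xf000 requested.
print(classify_write(0x15000, 0xf000))
```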

-apw
Martin J. Bligh
2006-10-11 19:59:19 UTC
Permalink
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19-rc1/2.6.19-rc1-mm1/
-
Oh, and hangs in LTP.

x86_64 just hangs.
http://test.kernel.org/abat/54544/debug/test.log.1 (in something io-ish)


http://test.kernel.org/abat/54541/debug/test.log.1 (ppc64)
craps itself with

Modules linked in:
NIP: C0000000000CC520 LR: C0000000000CC4EC CTR: 0000000000000FF5
REGS: c00000069ae83940 TRAP: 0700 Not tainted (2.6.19-rc1-mm1-autokern1)
MSR: 8000000000029032 <EE,ME,IR,DR> CR: 28002424 XER: 20000000
TASK = c000000768026660[32160] 'openfile' THREAD: c00000069ae80000 CPU: 2
GPR00: 0000000000000001 C00000069AE83BC0 C00000000065E548 FFFFFFFFFFFFFFF4
GPR04: 00000000000000D0 0000000000000040 C0000006DF1D200A 0000000000000008
GPR08: 0000000000000001 0000000000000000 C00000003FFA3000 0000000000000000
GPR12: 000000000000F032 C00000000053E480 0000000000000000 0000000000000000
GPR16: 00000000100D5D40 00000000100CCFF8 00000000FF8352E0 00000000F800F7E0
GPR20: 00000000FF8352D0 0000000010000D38 0000000000000001 0000000000000003
GPR24: 00000000F67C9F80 FFFFFFFFFFFFFF9C C0000007750C4680 C0000007750C4690
GPR28: 0000000000000040 C000000775AF9CC0 C000000000570348 C000000775AF9CC0
NIP [C0000000000CC520] .expand_files+0x1f4/0x354
LR [C0000000000CC4EC] .expand_files+0x1c0/0x354
Call Trace:
[C00000069AE83BC0] [C0000000000CC470] .expand_files+0x144/0x354 (unreliable)
[C00000069AE83C60] [C0000000000AE148] .get_unused_fd+0x80/0x170
[C00000069AE83D00] [C0000000000AE6EC] .do_sys_open+0x5c/0x140
[C00000069AE83DB0] [C0000000000ED574] .compat_sys_open+0x24/0x38
[C00000069AE83E30] [C00000000000871C] syscall_exit+0x0/0x40


and another one from another ppc64 box

gekko-lp1 login:-- 0:conmux-control -- time-stamp -- Oct/10/06 9:38:14 --
cpu 0x3: Vector: 700 (Program Check) at [c0000000e9a43960]
pc: c0000000000f1454: .expand_files+0x1f4/0x354
lr: c0000000000f1420: .expand_files+0x1c0/0x354
sp: c0000000e9a43be0
msr: 8000000000029032
current = 0xc0000000e8b4a810
paca = 0xc000000000482b80
pid = 26003, comm = creat05
kernel BUG in copy_fdtable at fs/file.c:138!
enter ? for help
[c0000000e9a43c80] c0000000000d0af8 .get_unused_fd+0x80/0x170
[c0000000e9a43d20] c0000000000d1094 .do_sys_open+0x54/0x12c
[c0000000e9a43dc0] c0000000000161b8 .compat_sys_creat+0x14/0x28
[c0000000e9a43e30] c00000000000871c syscall_exit+0x0/0x40
--- Exception: c01 (System Call) at 000000000ff6e3b0
SP (ff9cf310) is in userspace
3:mon>-- 0:conmux-control -- time-stamp -- Oct/10/06 9:40:19 --
-- 0:conmux-control -- time-stamp -- Oct/10/06 9:48:51 --
Michal Piotrowski
2006-10-11 20:10:44 UTC
Permalink
Post by Martin J. Bligh
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19-rc1/2.6.19-rc1-mm1/
-
Oh, and hangs in LTP.
It's probably a problem with fdtable patches
http://www.ussg.iu.edu/hypermail/linux/kernel/0610.1/0925.html

Regards,
Michal
--
Michal K. K. Piotrowski
LTG - Linux Testers Group
(http://www.stardust.webpages.pl/ltg/)
Andrew Morton
2006-10-11 21:47:13 UTC
Permalink
On Wed, 11 Oct 2006 12:59:19 -0700
Post by Martin J. Bligh
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19-rc1/2.6.19-rc1-mm1/
-
Oh, and hangs in LTP.
x86_64 just hangs.
http://test.kernel.org/abat/54544/debug/test.log.1 (in something io-ish)
What makes you think it was something io-ish?
Post by Martin J. Bligh
http://test.kernel.org/abat/54541/debug/test.log.1 (ppc64)
craps itself with
There's been a fix for this in hot-fixes/ for 24 hours. It'd be good if you
could tinkle the scripts to pull that directory in.

Or just suck the -mm git tree. That incorporates additions to hot-fixes/ within
five minutes.
Andy Whitcroft
2006-10-12 10:22:47 UTC
Permalink
Post by Andrew Morton
On Wed, 11 Oct 2006 12:59:19 -0700
Post by Martin J. Bligh
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19-rc1/2.6.19-rc1-mm1/
-
Oh, and hangs in LTP.
x86_64 just hangs.
http://test.kernel.org/abat/54544/debug/test.log.1 (in something io-ish)
What makes you think it was something io-ish?
Post by Martin J. Bligh
http://test.kernel.org/abat/54541/debug/test.log.1 (ppc64)
craps itself with
There's been a fix for this in hot-fixes/ for 24 hours. It'd be good if you
could tinkle the scripts to pull that directory in.
Or just suck the -mm git tree. That incorporates additions to hot-fixes/ within
five minutes.
I have to say I always forget it's there, debug and fix it only to find
it's in hot-fixes, grr. So having the test system notice and fire new
jobs off with these too is the Right Thing (tm).

I've hopefully modified the test system to take hot-fixes into account.
I do wonder if there should be a series file in this directory, as we
have no ordering otherwise and it would simplify the detection and
application of these patches at my end for sure.
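
To make the ordering question concrete, here is a small Python sketch of how a test system might pick an application order. It assumes a quilt-style `series` file (one patch name per line, `#` comments), which hot-fixes/ does not have at the moment, and falls back to sorted file names otherwise. The function name and the patch names are invented for illustration.

```python
def patch_order(filenames, series_text=None):
    """Return patch names in the order they should be applied.

    With a series file, follow it (ignoring blanks and # comments);
    without one, fall back to sorted names, which is only a guess.
    """
    patches = [f for f in filenames if f.endswith(".patch")]
    if series_text is None:
        return sorted(patches)
    ordered = []
    for line in series_text.splitlines():
        name = line.split("#", 1)[0].strip()
        # Skip comments, blank lines and names not present in the directory.
        if name and name in patches:
            ordered.append(name)
    return ordered

# Hypothetical directory listing and series file.
files = ["b-fix.patch", "a-revert.patch", "README"]
print(patch_order(files))
print(patch_order(files, "# apply in this order\nb-fix.patch\na-revert.patch\n"))
```

Without the series file the only available order is alphabetical, which is exactly the guesswork a series file would remove.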

Anyhow, hopefully we'll get some results in the next 4 hours out to TKO
and see how it looks ... assuming they don't all go "what patch, boom,
crash".

-apw
Badari Pulavarty
2006-10-12 18:09:49 UTC
Permalink
Post by Andrew Morton
On Wed, 11 Oct 2006 12:59:19 -0700
Post by Martin J. Bligh
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19-rc1/2.6.19-rc1-mm1/
-
Oh, and hangs in LTP.
x86_64 just hangs.
http://test.kernel.org/abat/54544/debug/test.log.1 (in something io-ish)
What makes you think it was something io-ish?
creat05 test hang goes away with hot-fix (revert-fd-table stuff).
FYI.

Thanks,
Badari
Vadim Lobanov
2006-10-12 18:52:19 UTC
Permalink
Post by Badari Pulavarty
creat05 test hang goes away with hot-fix (revert-fd-table stuff).
FYI.
Does it also go away with the "Eradicate fdarray overflow" patch instead of
the hot-fix?
Post by Badari Pulavarty
Thanks,
Badari
-- Vadim Lobanov
Badari Pulavarty
2006-10-12 19:01:00 UTC
Permalink
Post by Vadim Lobanov
Post by Badari Pulavarty
creat05 test hang goes away with hot-fix (revert-fd-table stuff).
FYI.
Does it also go away with the "Eradicate fdarray overflow" patch instead of
the hot-fix?
I just verified. Test hang goes away with your overflow fix.

Thanks,
Badari
Michael Lothian
2006-10-11 21:19:16 UTC
Permalink
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19-rc1/2.6.19-rc1-mm1/
git-libata-all.patch
Hi

I think I've found a regression in the pata-via module. My cdrom drive isn't
detected. I have tried pata-via and the SCSI CDROM device driver both
compiled in and as modules.

I think the problem may be due to my cdrom not having a jumper setting
either master or slave (or even cable select for that matter) as I lost the
wee jumper.

Even if this is the case I'd still call this a regression as the old code
finds it no problem

My dmesg states:

pata_via 0000:00:0f.1: version 0.1.14
ata3: PATA max UDMA/133 cmd 0x1F0 ctl 0x3F6 bmdma 0xA400 irq 14
ata4: PATA max UDMA/133 cmd 0x170 ctl 0x376 bmdma 0xA408 irq 15
scsi2 : pata_via
ata3.00: ATAPI, max UDMA/33
EXT3-fs: mounted filesystem with ordered data mode.
ata3.00: qc timeout (cmd 0xa1)
ata3.00: failed to IDENTIFY (I/O error, err_mask=0x4)
ata3.00: revalidation failed (errno=-5)
ata3.00: limiting speed to UDMA/25
ata3: failed to recover some devices, retrying in 5 secs
ata3.00: qc timeout (cmd 0xa1)
ata3.00: failed to IDENTIFY (I/O error, err_mask=0x4)
ata3.00: revalidation failed (errno=-5)
ata3: failed to recover some devices, retrying in 5 secs
ata3.00: qc timeout (cmd 0xa1)
ata3.00: failed to IDENTIFY (I/O error, err_mask=0x4)
ata3.00: revalidation failed (errno=-5)
ata3.00: disabled
scsi3 : pata_via
ATA: abnormal status 0x8 on port 0x177


Does anyone have any ideas on why this is the case?

Cheers for any help you can offer

Mike

PS Apologies for the 3rd time you're receiving this, Andrew
Helge Hafting
2006-10-12 12:18:04 UTC
Permalink
I found an easy way to hang the kernel when copying a SD-card:

dd if=/dev/sdc of=file bs=1048576

I.e. copy the entire 256MB card in 1MB chunks. I got about
160MB before the kernel hung. Not even sysrq+B worked, I needed
the reset button. The pc has a total of 512MB memory if that matters.

Using bs=4096 instead let me copy the entire card with no problems,
but that seems to progress slower.

The above 'dd' command hangs my office pc every time. So I can repeat
it for debugging purposes.


Helge Hafting
Andrew Morton
2006-10-12 18:29:38 UTC
Permalink
On Thu, 12 Oct 2006 14:18:04 +0200
Post by Helge Hafting
dd if=/dev/sdc of=file bs=1048576
I.e. copy the entire 256MB card in 1MB chunks. I got about
160MB before the kernel hung. Not even sysrq+B worked, I needed
the reset button. The pc has a total of 512MB memory if that matters.
Using bs=4096 instead let me copy the entire card with no problems,
but that seems to progress slower.
The above 'dd' command hangs my office pc every time. So I can repeat
it for debugging purposes.
What device driver is providing /dev/sdc?

Did any previous kernels work correctly? If so, which?
Helge Hafting
2006-10-13 13:11:11 UTC
Permalink
Post by Andrew Morton
On Thu, 12 Oct 2006 14:18:04 +0200
Post by Helge Hafting
dd if=/dev/sdc of=file bs=1048576
I.e. copy the entire 256MB card in 1MB chunks. I got about
160MB before the kernel hung. Not even sysrq+B worked, I needed
the reset button. The pc has a total of 512MB memory if that matters.
Using bs=4096 instead let me copy the entire card with no problems,
but that seems to progress slower.
The above 'dd' command hangs my office pc every time. So I can repeat
it for debugging purposes.
What device driver is providing /dev/sdc?
It is a USB card reader, so it is "usb mass storage"
and "scsi disk".
Post by Andrew Morton
Did any previous kernels work correctly? If so, which?
I just got that card reader, so I haven't tested any earlier kernels.
I have another machine with a card reader, which I have used for
a long time. But I only ever copy files with "cp" on that one.

This time I used "dd" to get an image of the entire card, and got trouble
when using 1M chunks.

I can try with verbose scsi debug messages if that might help?

Helge Hafting
Andrew Morton
2006-10-13 16:29:41 UTC
Permalink
On Fri, 13 Oct 2006 15:11:11 +0200
Post by Helge Hafting
Post by Andrew Morton
On Thu, 12 Oct 2006 14:18:04 +0200
Post by Helge Hafting
dd if=/dev/sdc of=file bs=1048576
I.e. copy the entire 256MB card in 1MB chunks. I got about
160MB before the kernel hung. Not even sysrq+B worked, I needed
the reset button. The pc has a total of 512MB memory if that matters.
Using bs=4096 instead let me copy the entire card with no problems,
but that seems to progress slower.
The above 'dd' command hangs my office pc every time. So I can repeat
it for debugging purposes.
What device driver is providing /dev/sdc?
It is a USB card reader, so it is "usb mass storage"
and "scsi disk".
Post by Andrew Morton
Did any previous kernels work correctly? If so, which?
I just got that card reader, so I haven't tested any earlier kernels.
I have another machine with a card reader, which I have used for
a long time. But I only ever copy files with "cp" on that one.
This time I used "dd" to get an image of the entire card, and got trouble
when using 1M chunks.
I can try with verbose scsi debug messages if that might help?
Maybe. The first step is to tell the developers. (adds cc).


-------------------------------------------------------------------------
Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
_______________________________________________
linux-usb-***@lists.sourceforge.net
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel
Alan Stern
2006-10-13 18:10:34 UTC
Permalink
Post by Andrew Morton
On Fri, 13 Oct 2006 15:11:11 +0200
Post by Helge Hafting
Post by Andrew Morton
On Thu, 12 Oct 2006 14:18:04 +0200
Post by Helge Hafting
dd if=/dev/sdc of=file bs=1048576
I.e. copy the entire 256MB card in 1MB chunks. I got about
160MB before the kernel hung. Not even sysrq+B worked, I needed
the reset button. The pc has a total of 512MB memory if that matters.
Using bs=4096 instead let me copy the entire card with no problems,
but that seems to progress slower.
The above 'dd' command hangs my office pc every time. So I can repeat
it for debugging purposes.
What device driver is providing /dev/sdc?
It is a USB card reader, so it is "usb mass storage"
and "scsi disk".
Post by Andrew Morton
Did any previous kernels work correctly? If so, which?
I just got that card reader, so I haven't tested any earlier kernels.
I have another machine with a card reader, which I have used for
a long time. But I only ever copy files with "cp" on that one.
This time I used "dd" to get an image of the entire card, and got trouble
when using 1M chunks.
I can try with verbose scsi debug messages if that might help?
Verbose usb-storage debugging messages would help more
(CONFIG_USB_STORAGE_DEBUG and CONFIG_USB_DEBUG). If the kernel hangs very
badly you might need to use a serial console to capture all the logging
information.

Alan Stern


Helge Hafting
2006-10-18 09:31:41 UTC
Permalink
Post by Alan Stern
Verbose usb-storage debugging messages would help more
(CONFIG_USB_STORAGE_DEBUG and CONFIG_USB_DEBUG). If the kernel hangs very
badly you might need to use a serial console to capture all the logging
information.
Version information first: This is 2.6.19-rc1, not mm1. I apparently
forgot to apply the mm1 patch before compiling it.

I got a BUG, which I could write down by getting X out of the way first.
It is repeatable, just ask if I omitted something crucial. On bootup,
the verbose debugging complains about read errors on sdc;
I guess the kernel tries to get the partition table. I have no idea
why there are read errors - that shouldn't hang anything though.

To bring it down:

dd if=/dev/sdc of=sdc.dump bs=1M

sd 0:0:0:2 ioctl_internal_command return code: 8000002
:Current: Sense key: Hardware Error
Additional Sense: End_of_data detected
cut here----
Kernel BUG at [Verbose debugging unavailable]
invalid opcode: 0000 [#1]
cpu:0
EIP: 0060:[<c031f823>] Not tainted VLI
Eflags: 00010002 (2.6.16-rc1 #16)
EIP is at start_unlink_async
eax:00000000 ebx:dfe69180 ecx:e0832020 edx:00000005
esi:dffdb6bc edi:00010021 ebp:dffdb6bc esp:c0664d58
ds:007b es:007b ss:0068
Process swapper . . .
stack . . .
Call trace
ehci_urb_dequeue
unlink1
usb_hcd_unlink_urb
sg_complete
usb_hcd_giveback_urb
qh_completions
ehci_work
ehci_irq
usb_hcd_irq
handle_IRQ_event
handle_fasteoi_irq
do_IRQ

Code 5d e9 8e 31 ff ff f6 43 28 01 75 b8 c7 43 24 00 00 00 00 eb af . . .
<0>Kernel Panic - not syncing fatal exception in interrupt
<0>Rebooting in 300 seconds

It did reboot after 300 seconds, so I had to crash twice to get this much written down.
I checked that what I wrote down the first time was identical.

Invalid opcode suggests a compiler bug or memory scribble, or possibly
calling a bad function pointer. The crash is trivial to reproduce,
just ask me.

In case it matters:
$ lspci
00:00.0 Host bridge: Silicon Integrated Systems [SiS] SiS645DX Host & Memory & AGP Controller
00:01.0 PCI bridge: Silicon Integrated Systems [SiS] Virtual PCI-to-PCI bridge (AGP)
00:02.0 ISA bridge: Silicon Integrated Systems [SiS] SiS962 [MuTIOL Media IO] (rev 04)
00:02.1 SMBus: Silicon Integrated Systems [SiS] SiS961/2 SMBus Controller
00:02.5 IDE interface: Silicon Integrated Systems [SiS] 5513 [IDE]
00:02.7 Multimedia audio controller: Silicon Integrated Systems [SiS] AC'97 Sound Controller (rev a0)
00:03.0 USB Controller: Silicon Integrated Systems [SiS] USB 1.0 Controller (rev 0f)
00:03.1 USB Controller: Silicon Integrated Systems [SiS] USB 1.0 Controller (rev 0f)
00:03.2 USB Controller: Silicon Integrated Systems [SiS] USB 1.0 Controller (rev 0f)
00:03.3 USB Controller: Silicon Integrated Systems [SiS] USB 2.0 Controller
00:0b.0 Ethernet controller: 3Com Corporation 3c905C-TX/TX-M [Tornado] (rev 78)
00:0c.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10)
01:00.0 VGA compatible controller: ATI Technologies Inc Radeon RV100 QY [Radeon 7000/VE]


Helge


Alan Stern
2006-10-18 16:26:10 UTC
Permalink
Post by Helge Hafting
Post by Alan Stern
Verbose usb-storage debugging messages would help more
(CONFIG_USB_STORAGE_DEBUG and CONFIG_USB_DEBUG). If the kernel hangs very
badly you might need to use a serial console to capture all the logging
information.
Version information first: This is 2.6.19-rc1, not mm1. I apparently
forgot to apply the mm1 patch before compiling it.
I got a BUG, which I could write down by getting X out of the way first.
It is repeatable, just ask if I omitted something crucial. On bootup,
the verbose debugging complains about read errors on sdc;
I guess the kernel tries to get the partition table. I have no idea
why there are read errors - that shouldn't hang anything though.
That's why I asked for the USB debugging logs (which you forgot to include
here).
Post by Helge Hafting
dd if=/dev/sdc of=sdc.dump bs=1M
sd 0:0:0:2 ioctl_internal_command return code: 8000002
:Current: Sense key: Hardware Error
Additional Sense: End_of_data detected
cut here----
Kernel BUG at [Verbose debugging unavailable]
invalid opcode: 0000 [#1]
cpu:0
EIP: 0060:[<c031f823>] Not tainted VLI
Eflags: 00010002 (2.6.16-rc1 #16)
Hmmm. Well, a recent patch affecting that area was just reverted because
it added some problems. Maybe you're seeing those same problems.
Although I don't think they involved invalid opcode errors...

You're the second person I've seen report invalid opcode errors in the
recent kernels. The other report involved uhci-hcd, not ehci-hcd. See
here:

http://marc.theaimsgroup.com/?l=linux-usb-users&m=115942141207661&w=2

It's possible that both of these are caused by something unrelated
overwriting kernel memory.

By the way, what happens if you add a "skip=" argument to dd so that the
copy begins near the end of the device? Does the oops then occur that
much sooner?
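
In case it helps with picking a value: dd counts skip= in input blocks of bs bytes, so the number to pass is the start offset divided by the block size. A rough Python sketch of that arithmetic follows. The 256 MiB size is a nominal figure (the real capacity of the card will differ a little), and the function name and the output file name are made up for this example.

```python
def dd_skip_for_tail(device_bytes, block_size, tail_bytes):
    """Return the skip= block count so dd starts tail_bytes before the end.

    dd counts skip= in input blocks of block_size bytes, so the start
    offset is rounded down to a block boundary.
    """
    start = max(device_bytes - tail_bytes, 0)
    return start // block_size

MiB = 1024 * 1024
# Assumed figures: a nominal 256 MiB card, bs=1M, copy only the last 16 MiB.
skip = dd_skip_for_tail(256 * MiB, 1 * MiB, 16 * MiB)
print("dd if=/dev/sdc of=sdc.tail bs=1M skip=%d" % skip)
```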

Oh, and the next time this happens, could you copy down all of the code
bytes from the oops message? And also provide the section from "objdump
-d drivers/usb/host/ehci-hcd.o" for the start_unlink_async routine?

Alan Stern
Helge Hafting
2006-10-19 12:25:40 UTC
Permalink
[...]
Post by Alan Stern
That's why I asked for the USB debugging logs (which you forgot to include
here).
Attached dmesg.gz with lots of usb messages.
Post by Alan Stern
Post by Helge Hafting
dd if=/dev/sdc of=sdc.dump bs=1M
This time, it seems to have crashed on the first megabyte.
I mounted the filesystem synchronously, and still I had 0 bytes
in the dumpfile. The crash also came with no delay after
pressing enter.
Post by Alan Stern
It's possible that both of these are caused by something unrelated
overwriting kernel memory.
Something like a function pointer mistaken for a data pointer?
Post by Alan Stern
By the way, what happens if you add a "skip=" argument to dd so that the
copy begins near the end of the device? Does the oops then occur that
much sooner?
No, it is random. May happen immediately, may happen after a while.
I even had "cfdisk /dev/sdc" crash on me fresh after a reboot.
Post by Alan Stern
Oh, and the next time this happens, could you copy down all of the code
bytes from the oops message? And also provide the section from "objdump
-d drivers/usb/host/ehci-hcd.o" for the start_unlink_async routine?
objdump for start_unlink_async attached.

From the BUG:

Stack (All I got before it rebooted after 300s)
00000010 c0664dc8 dff84000 dffdbc00 dffdb600 00000296
df9244c0 c03248de c0664dc8

EIP: [<c031f823>] start_unlink_async+0x16/0xf2
SS:ESP:0068:c0664d58



Code (Complete) 5d e9 8e 31 ff ff f6 43 28 01 75 b8 c7 43 24 00 00 00 00
eb af
57 56 53 83 ec 10 89 c6 89 d3 8b 48 04 8b 39 8b 40 14 85 c0 74
6f <0f> 0b 39 5e 10 74 78 c6 43 68 02 8d 43 60 e8 9f 3c f1 ff 89 5e

I found this in the start_unlink_async dump - here it is with the
same line breaking as well as the differences:
{Before start_unlink_async}
5d
e9 8e 31 ff ff ; objdump has "e9 fc ff ff ff" here, it is a jump
f6 43 28 01
75 b8
c7 43 24 00 00 00 00
eb af
start_unlink_async
57
56
53
83 ec 10
89 c6
89 d3
8b 48 04
8b 39
8b 40 14
85 c0
74 6f
0f 0b
39 5e 10
74 78
c6 43 68 02
8d 43 60
e8 9f 3c f1 ff ; objdump has "e8 fc ff ff ff" here, a call
89 5e

Calls and jumps are different, but I guess that is just linking effects?
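
For what it is worth, the differing bytes can be decoded to check that guess. On x86 the e8 (call) and e9 (jmp) opcodes are followed by a 32-bit little-endian signed displacement relative to the next instruction. In an unlinked object file that field normally holds a placeholder waiting for relocation, so "fc ff ff ff" there versus a real displacement in the running kernel is what linking would produce. Separately, "0f 0b" is the ud2 instruction, which as far as I know is what BUG() compiles to on i386, and that would account for the "invalid opcode". A small Python sketch of the decoding (the function name is made up):

```python
def rel32(operand_hex):
    """Decode the 4-byte little-endian signed displacement of call/jmp."""
    return int.from_bytes(bytes.fromhex(operand_hex), "little", signed=True)

# Operand bytes copied from the comparison above.
print(rel32("fc ff ff ff"))   # placeholder in the unlinked object file
print(rel32("8e 31 ff ff"))   # jmp displacement in the running kernel
print(rel32("9f 3c f1 ff"))   # call displacement in the running kernel
```

The first value comes out as -4, the usual placeholder for a relocation that is still to be applied; the other two are ordinary backward displacements.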

Hope this is useful,
Helge Hafting
Alan Stern
2006-10-19 18:40:17 UTC
Permalink
Post by Helge Hafting
[...]
Post by Alan Stern
That's why I asked for the USB debugging logs (which you forgot to include
here).
Attached dmesg.gz with lots of usb messages.
But no messages from the time just before the BUG occurred. :-(
Post by Helge Hafting
Post by Alan Stern
Post by Helge Hafting
dd if=/dev/sdc of=sdc.dump bs=1M
This time, it seems to have crashed on the first megabyte.
I mounted the filesystem synchronously, and still I had 0 bytes
in the dumpfile. The crash also came with no delay after
pressing enter.
Post by Alan Stern
It's possible that both of these are caused by something unrelated
overwriting kernel memory.
Something like a function pointer mistaken for a data pointer?
After looking at the debugging output, no. That "invalid opcode" is a red
herring. What you encountered this time was a BUG() in the source code of
start_unlink_async() in drivers/usb/host/ehci-q.c:

#ifdef DEBUG
	assert_spin_locked(&ehci->lock);
	if (ehci->reclaim
			|| (qh->qh_state != QH_STATE_LINKED
				&& qh->qh_state != QH_STATE_UNLINK_WAIT)
			)
		BUG ();
#endif

You could try putting a printk() just before the BUG() to display the
values of ehci->reclaim and qh->qh_state. Maybe also change the BUG() to
WARN(), which might help prevent your system from crashing so badly.
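
To make the condition easier to reason about, here is a small Python model of the check quoted above. It is not kernel code: the print stands in for the suggested printk(), `reclaim` is modelled as None or any object, and the two numeric state values are what I recall from ehci.h in this era, so treat them as assumptions and check your own tree.

```python
# Numeric values as recalled from drivers/usb/host/ehci.h of this era;
# treat them as assumptions and check your own tree.
QH_STATE_LINKED = 1
QH_STATE_UNLINK_WAIT = 4

def unlink_check(reclaim, qh_state):
    """Model of the DEBUG test in start_unlink_async().

    Prints both values first (standing in for the suggested printk),
    then returns True if the BUG() would fire.
    """
    print("ehci->reclaim=%r qh->qh_state=%d" % (reclaim, qh_state))
    return bool(reclaim) or qh_state not in (QH_STATE_LINKED,
                                             QH_STATE_UNLINK_WAIT)

# A linked QH with nothing on the reclaim list passes the check.
print(unlink_check(None, QH_STATE_LINKED))
```

So the BUG() fires if a reclaim is already pending, or if the QH is in any state other than the two listed.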

Monty has been making changes to this driver recently; maybe he has some
ideas about the problem.

Alan Stern


Christopher "Monty" Montgomery
2006-10-19 18:57:56 UTC
Permalink
Post by Alan Stern
Monty has been making changes to this driver recently; maybe he has some
ideas about the problem.
I have been watching the thread worried that this is due to a change
I've made. However, I should not have done anything to change
handling on the async queue-- at least, I've not made any changes
intentionally, which is not the same thing as not making any changes.

I'll also be interested to see the result of the additional debug message.

Monty

Helge Hafting
2006-10-20 11:44:09 UTC
Permalink
Alan Stern wrote:
[...]
Post by Alan Stern
After looking at the debugging output, no. That "invalid opcode" is a red
herring. What you encountered this time was a BUG() in the source code of
#ifdef DEBUG
assert_spin_locked(&ehci->lock);
if (ehci->reclaim
|| (qh->qh_state != QH_STATE_LINKED
&& qh->qh_state != QH_STATE_UNLINK_WAIT)
)
BUG ();
#endif
You could try putting a printk() just before the BUG() to display the
values of ehci->reclaim and qh->qh_state. Maybe also change the BUG() to
ehci->reclaim=0
qh->qh_state=5
Post by Alan Stern
WARN(), which might help prevent your system from crashing so badly.
WARN didn't help much. I then got the warning twice, followed by
another BUG:
process klogd
ehci_irq
usb_hcd_irq
handle_IRQ_event
handle_fasteoi_irq
do_IRQ

So I set it back to BUG. Crashing hard isn't so bad when I
know what is coming - I simply remount everything synchronously
before trying.

I hope these printk's help. I can add more of them too, if needed.
Big transfers seem to bring out the worst - I always get the
crash on the first megabyte now.

During boot I get lots of those "Hardware error, end-of-data detected"
messages, but I've never seen it crash during bootup.

Helge Hafting

Alan Stern
2006-10-20 15:55:44 UTC
Permalink
Post by Helge Hafting
[...]
Post by Alan Stern
After looking at the debugging output, no. That "invalid opcode" is a red
herring. What you encountered this time was a BUG() in the source code of
#ifdef DEBUG
assert_spin_locked(&ehci->lock);
if (ehci->reclaim
|| (qh->qh_state != QH_STATE_LINKED
&& qh->qh_state != QH_STATE_UNLINK_WAIT)
)
BUG ();
#endif
You could try putting a printk() just before the BUG() to display the
values of ehci->reclaim and qh->qh_state. Maybe also change the BUG() to
ehci->reclaim=0
qh->qh_state=5
5 is QH_STATE_COMPLETING. That explains why the BUG() fires.

At this point it's beyond me. Monty will have to take it from here.
Post by Helge Hafting
During boot I get lots of those "Hardware error, end-of-data detected"
messages, but I've never seen it crash during bootup.
Those messages are from the card reader. It doesn't seem to be working
right. It returns the "end-of-data" error in response to a PREVENT MEDIUM
REMOVAL command and it returns a phase error in response to a READ
command. In spite of the fact that it claims to have a 256 MB card
present.

Alan Stern


Helge Hafting
2006-10-23 09:12:30 UTC
Permalink
Post by Alan Stern
Post by Helge Hafting
Post by Alan Stern
You could try putting a printk() just before the BUG() to display the
values of ehci->reclaim and qh->qh_state. Maybe also change the BUG() to
ehci->reclaim=0
qh->qh_state=5
5 is QH_STATE_COMPLETING. That explains why the BUG() fires.
At this point it's beyond me. Monty will have to take it from here.
Post by Helge Hafting
During boot I get lots of those "Hardware error, end-of-data detected"
messages, but I've never seen it crash during bootup.
Those messages are from the card reader. It doesn't seem to be working
right. It returns the "end-of-data" error in response to a PREVENT MEDIUM
REMOVAL command
Unlike a cdrom, it doesn't have the means to prevent media removal. :-)
Post by Alan Stern
and it returns a phase error in response to a READ
command. In spite of the fact that it claims to have a 256 MB card
present.
It has slots for several different cards, all the other
slots are empty.

Perhaps it is broken, but interesting as a "stress-test".
Linux should not crash because of a bad usb thing, just complain.

Helge Hafting
Alan Stern
2006-10-23 14:13:39 UTC
Permalink
Post by Helge Hafting
Post by Alan Stern
Post by Helge Hafting
During boot I get lots of those "Hardware error, end-of-data detected"
messages, but I've never seen it crash during bootup.
Those messages are from the card reader. It doesn't seem to be working
right. It returns the "end-of-data" error in response to a PREVENT MEDIUM
REMOVAL command
Unlike a cdrom, it doesn't have the means to prevent media removal. :-)
There's nothing wrong with that; lots of devices don't have the means to
prevent media removal. The proper response is "Invalid Command", not "End
of Data". If the device sent the proper reply you wouldn't get all those
"Hardware error, end-of-data detected" messages in the log.
Post by Helge Hafting
Post by Alan Stern
and it returns a phase error in response to a READ
command. In spite of the fact that it claims to have a 256 MB card
present.
It has slots for several different cards, all the other
slots are empty.
Were all the slots empty? If yes, the reader should not have indicated a
card was present in that slot. If no, the reader should have returned the
data from the card instead of a phase error. Either way, it misbehaved.
Post by Helge Hafting
Perhaps it is broken, but interesting as a "stress-test".
Linux should not crash because of a bad usb thing, just complain.
Linux did not crash because of the bad reader; it crashed because of an
unrelated bug in ehci-hcd. (Although it's possible that the bug was
triggered by the error-recovery for the bad reader.)

Alan Stern
Helge Hafting
2006-10-24 10:16:46 UTC
Permalink
Post by Alan Stern
At this point it's beyond me. Monty will have to take it from here.
I will look more closely at what might have changed there. Despite
the code refactoring (and a hand-resolved patch collision at that
point) the async disable handling *should* have been functionally
unchanged from 2.6.18. I will revisit that closely.
Has it actually been demonstrated that this does not crash 2.6.18
(pre-my-patches) kernels?
I just tested this. 2.6.18 does not crash. I still get tons of errors,
and no data. Copying with 1 MB chunks or 4 kB chunks
makes no difference; it doesn't work. So the card, reader or driver must be faulty.
The card works in a Windows machine, though.

2.6.19-rc1 gets data with 4kB chunks, and BUGs with 1M chunks.
If it crashes earlier, that doesn't mean
I'm uninterested in fixing it, I just want to know. I don't think
that had been explicitly answered earlier in the thread.
2.6.18 wasn't tried before, the reason being I did not have this
card reader when 2.6.18 was current.

dmesg output for 2.6.18 (after the dd attempts) is attached. I have
edited out stuff that isn't usb or scsi.

Helge Hafting
Alan Stern
2006-10-24 14:09:05 UTC
Permalink
Post by Helge Hafting
I just tested this. 2.6.18 does not crash. I still get tons of errors,
and no data. Copying with 1 MB chunks or 4 kB chunks
makes no difference; it doesn't work. So the card, reader or driver must be faulty.
The card works in a Windows machine, though.
2.6.19-rc1 gets data with 4kB chunks, and BUGs with 1M chunks.
It would be interesting to compare 2.6.18 with 2.6.19-rc to see why the
first gets only errors while the second is able to transfer some data
using 4 KB chunks.

(By the way, what do you mean by 4 KB chunks or 1 MB chunks? Does this
refer to the bs= option for dd? That has almost nothing to do with the
size of the transfers actually sent to the device.)
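
As a rough illustration of the point about bs=: the upper bound on a single request is set by the block layer, not by dd. The sketch below reads that cap from the max_sectors_kb queue attribute in sysfs. Whether the attribute is present depends on the kernel in use, the device name "sdb" and the value 120 are placeholders, and the demonstration runs against a temporary fake sysfs tree rather than real hardware. Readahead and request merging, which also shape the real transfers, are ignored.

```python
import os
import tempfile


def max_request_kb(device, sysfs_root="/sys/block"):
    """Return the block layer's per-request size cap for `device`, in KB."""
    path = os.path.join(sysfs_root, device, "queue", "max_sectors_kb")
    with open(path) as f:
        return int(f.read().strip())


def requests_needed(total_kb, cap_kb):
    """Lower bound on the number of requests needed to move total_kb."""
    return -(-total_kb // cap_kb)   # ceiling division


# Demonstration against a fake sysfs tree, so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as root:
    queue_dir = os.path.join(root, "sdb", "queue")   # "sdb" is a placeholder
    os.makedirs(queue_dir)
    with open(os.path.join(queue_dir, "max_sectors_kb"), "w") as f:
        f.write("120\n")                              # illustrative value only
    cap = max_request_kb("sdb", sysfs_root=root)

# A 1 MB read cannot reach the device as one command under this cap.
print(cap, requests_needed(1024, cap))
```

On a real system, calling max_request_kb() with the reader's device name and the default sysfs_root would show the cap that actually applies.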

But the log will be useless unless you turn on CONFIG_USB_STORAGE_DEBUG.
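
For anyone unsure whether their kernel was built with that option, here is a small sketch that checks a kernel .config for it. It relies only on the usual .config conventions (OPTION=value for enabled options, "# OPTION is not set" for disabled ones); the function name and the sample text are made up for illustration.

```python
import re


def kconfig_option_state(config_text, option):
    """Return the option's value ('y', 'm', ...) or None if it is not set."""
    pattern = re.compile(rf"^{re.escape(option)}=(.*)$", re.MULTILINE)
    match = pattern.search(config_text)
    return match.group(1) if match else None


# Sample text in .config format (made up for this example).
sample = """\
CONFIG_USB_STORAGE=m
# CONFIG_USB_STORAGE_DEBUG is not set
"""

state = kconfig_option_state(sample, "CONFIG_USB_STORAGE_DEBUG")
if state is None:
    print("CONFIG_USB_STORAGE_DEBUG is off; enable it and rebuild")
else:
    print("CONFIG_USB_STORAGE_DEBUG =", state)
```

To check a real build, pass in the contents of the .config file from the kernel source tree instead of the sample string.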

Alan Stern

V***@vt.edu
2006-10-12 18:37:39 UTC
Permalink
Post by Andrew Morton
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19-rc1/2.6.19-rc1-mm1/
- Added the high-resolution timers and dynamic-ticks code. Please be sure
to cc ***@linutronix.de, ***@elte.hu and ***@us.ibm.com if it blows
up.
Compiles, boots, and behaves on my Dell Latitude C840 that previously had
indigestion. It selected the ACPI-PM timesource right off the bat (for reasons
I don't understand; previous dynticks used the tsc timesource), so I'm not
seeing the huge clock drift issues I had with previous dyntick patches when
running 'cpuspeed' - it drops from 1.6 GHz to 1.2 GHz and back without a problem.