Discussion:
Per-open file private data for the cdevs
Kostik Belousov
2008-05-04 17:10:02 UTC
Permalink
Since the review for the clone-at-open patch (fdclone) posted some time ago
mostly says that it would be better to implement per-file private data
instead, I produced the patch along this line,

The patch does not change the cdevsw ABI, instead, three new functions
nt devfs_get_cdevpriv(void **datap);
int devfs_set_cdevpriv(void *priv, cdevpriv_dtr_t dtr);
void devfs_clear_cdevpriv(void);
are provided for manipulation of the per-file private data.

devfs_set_cdevpriv assigns the priv as private data for the file descriptor
which is used to initiate currently performed driver operation. dtr
is the function that will be called when either the last refernce to
the file goes away or devfs_clear_cdevpriv is called.

devfs_get_cdevpriv is the obvious accessor.

devfs_clear_cdevpriv allows to clear the private data for the still
open file.

The synchronization of the cdev data and file private data is left
to the driver code, I did not found any generic helper mechanism that
could be useful there.

Patch:
http://people.freebsd.org/~kib/misc/fdpriv.1.patch

Dumb driver that shows the basic usage of the proposed KPI:
http://people.freebsd.org/~kib/misc/fpclone.c

Previous version of the patch was tested by Peter Holm.
Ed Schouten
2008-05-05 07:49:24 UTC
Permalink
Post by Kostik Belousov
Since the review for the clone-at-open patch (fdclone) posted some time ago
mostly says that it would be better to implement per-file private data
instead, I produced the patch along this line,
I also thought about this. The new TTY layer I'm developing needs the
following patch to implement /dev/ptmx and /dev/ptyXX compatibility:

--- sys/fs/devfs/devfs_vnops.c
+++ sys/fs/devfs/devfs_vnops.c
@@ -800,9 +800,8 @@
if(fp == NULL)
return (error);
#endif
- KASSERT(fp->f_ops == &badfileops,
- ("Could not vnode bypass device on fdops %p", fp->f_ops));
- finit(fp, fp->f_flag, DTYPE_VNODE, dev, &devfs_ops_f);
+ if (fp->f_ops == &badfileops)
+ finit(fp, fp->f_flag, DTYPE_VNODE, dev, &devfs_ops_f);
return (error);
}

This way drivers can just implement d_fdopen() and call finit() there.
It's probably not as nice as having the per-fdesc stuff inside devfs
itself, but I'm not sure the amount of drivers that needs this makes it
worth adding it to devfs itself.
--
Ed Schouten <***@80386.nl>
WWW: http://80386.nl/
John Baldwin
2008-05-05 13:30:17 UTC
Permalink
Post by Ed Schouten
Post by Kostik Belousov
Since the review for the clone-at-open patch (fdclone) posted some time
ago mostly says that it would be better to implement per-file private
data instead, I produced the patch along this line,
I also thought about this. The new TTY layer I'm developing needs the
--- sys/fs/devfs/devfs_vnops.c
+++ sys/fs/devfs/devfs_vnops.c
@@ -800,9 +800,8 @@
if(fp == NULL)
return (error);
#endif
- KASSERT(fp->f_ops == &badfileops,
- ("Could not vnode bypass device on fdops %p", fp->f_ops));
- finit(fp, fp->f_flag, DTYPE_VNODE, dev, &devfs_ops_f);
+ if (fp->f_ops == &badfileops)
+ finit(fp, fp->f_flag, DTYPE_VNODE, dev, &devfs_ops_f);
return (error);
}
This way drivers can just implement d_fdopen() and call finit() there.
It's probably not as nice as having the per-fdesc stuff inside devfs
itself, but I'm not sure the amount of drivers that needs this makes it
worth adding it to devfs itself.
Many drivers currently do devfs cloning soley to get per-file data. Other
OS's (such as WinXP and Linux) already provide facilities for drivers to set
per-file data as well. This is definitely very useful.
--
John Baldwin
Pawel Jakub Dawidek
2008-05-05 08:13:55 UTC
Permalink
Post by Kostik Belousov
Since the review for the clone-at-open patch (fdclone) posted some time ago
mostly says that it would be better to implement per-file private data
instead, I produced the patch along this line,
The patch does not change the cdevsw ABI, instead, three new functions
nt devfs_get_cdevpriv(void **datap);
int devfs_set_cdevpriv(void *priv, cdevpriv_dtr_t dtr);
void devfs_clear_cdevpriv(void);
are provided for manipulation of the per-file private data.
devfs_set_cdevpriv assigns the priv as private data for the file descriptor
which is used to initiate currently performed driver operation. dtr
is the function that will be called when either the last refernce to
the file goes away or devfs_clear_cdevpriv is called.
devfs_get_cdevpriv is the obvious accessor.
devfs_clear_cdevpriv allows to clear the private data for the still
open file.
The synchronization of the cdev data and file private data is left
to the driver code, I did not found any generic helper mechanism that
could be useful there.
http://people.freebsd.org/~kib/misc/fdpriv.1.patch
http://people.freebsd.org/~kib/misc/fpclone.c
Previous version of the patch was tested by Peter Holm.
Can you see if my OSD (Object-Specific-Data) KPI can be useful here?
I've it only in perforce for now, but I think it's what you are looking
for. I use it for jails and threads currently, but it is trivial to use
it for other objects. Take a look:

http://perforce.freebsd.org/fileViewer.cgi?FSPC=//depot/user/pjd/zfs/sys/sys/osd.h&REV=3

When your code is loaded/initialized you call:

static int slot;

slot = osd_file_register(mod_destroy);

Where mod_destroy is a function which knows how to free your private
data. On unload:

osd_file_deregister(slot);

Now, when you want to attach private data:

error = osd_file_set(fp, slot, ptr_to_your_data);

You can get it with:

ptr = osd_file_get(fp, slot);
--
Pawel Jakub Dawidek http://www.wheel.pl
***@FreeBSD.org http://www.FreeBSD.org
FreeBSD committer Am I Evil? Yes, I Am!
Kostik Belousov
2008-05-05 14:56:31 UTC
Permalink
Post by Pawel Jakub Dawidek
Post by Kostik Belousov
Since the review for the clone-at-open patch (fdclone) posted some time ago
mostly says that it would be better to implement per-file private data
instead, I produced the patch along this line,
The patch does not change the cdevsw ABI, instead, three new functions
nt devfs_get_cdevpriv(void **datap);
int devfs_set_cdevpriv(void *priv, cdevpriv_dtr_t dtr);
void devfs_clear_cdevpriv(void);
are provided for manipulation of the per-file private data.
devfs_set_cdevpriv assigns the priv as private data for the file descriptor
which is used to initiate currently performed driver operation. dtr
is the function that will be called when either the last refernce to
the file goes away or devfs_clear_cdevpriv is called.
devfs_get_cdevpriv is the obvious accessor.
devfs_clear_cdevpriv allows to clear the private data for the still
open file.
The synchronization of the cdev data and file private data is left
to the driver code, I did not found any generic helper mechanism that
could be useful there.
http://people.freebsd.org/~kib/misc/fdpriv.1.patch
http://people.freebsd.org/~kib/misc/fpclone.c
Previous version of the patch was tested by Peter Holm.
Can you see if my OSD (Object-Specific-Data) KPI can be useful here?
I've it only in perforce for now, but I think it's what you are looking
for. I use it for jails and threads currently, but it is trivial to use
http://perforce.freebsd.org/fileViewer.cgi?FSPC=//depot/user/pjd/zfs/sys/sys/osd.h&REV=3
static int slot;
slot = osd_file_register(mod_destroy);
Where mod_destroy is a function which knows how to free your private
osd_file_deregister(slot);
error = osd_file_set(fp, slot, ptr_to_your_data);
ptr = osd_file_get(fp, slot);
This is a nice feature, but I think that privdata is a fundamental
internal operation of the kernel, and it would be wrong to use
the OSD. In fact, the object we shall attach OSD to is the struct
file, and we would need to export the struct file to the cdevsw operations.
Since it both breaks current ABI and complicates the future changes to
the struct file, I would much prefer to not engage this route.
Pawel Jakub Dawidek
2008-05-06 06:12:15 UTC
Permalink
Post by Kostik Belousov
Post by Pawel Jakub Dawidek
Can you see if my OSD (Object-Specific-Data) KPI can be useful here?
I've it only in perforce for now, but I think it's what you are looking
for. I use it for jails and threads currently, but it is trivial to use
http://perforce.freebsd.org/fileViewer.cgi?FSPC=//depot/user/pjd/zfs/sys/sys/osd.h&REV=3
static int slot;
slot = osd_file_register(mod_destroy);
Where mod_destroy is a function which knows how to free your private
osd_file_deregister(slot);
error = osd_file_set(fp, slot, ptr_to_your_data);
ptr = osd_file_get(fp, slot);
This is a nice feature, but I think that privdata is a fundamental
internal operation of the kernel, and it would be wrong to use
the OSD. [...]
OSD is there to be used mostly by optional modules. For example if ZFS
is not loaded there is no need to keep a pointer to its private data in
struct prison and struct thread.
Post by Kostik Belousov
[...] In fact, the object we shall attach OSD to is the struct
file, and we would need to export the struct file to the cdevsw operations.
Since it both breaks current ABI and complicates the future changes to
the struct file, I would much prefer to not engage this route.
Ok.
--
Pawel Jakub Dawidek http://www.wheel.pl
***@FreeBSD.org http://www.FreeBSD.org
FreeBSD committer Am I Evil? Yes, I Am!
John Baldwin
2008-05-05 13:39:42 UTC
Permalink
Post by Kostik Belousov
Since the review for the clone-at-open patch (fdclone) posted some time ago
mostly says that it would be better to implement per-file private data
instead, I produced the patch along this line,
The patch does not change the cdevsw ABI, instead, three new functions
nt devfs_get_cdevpriv(void **datap);
int devfs_set_cdevpriv(void *priv, cdevpriv_dtr_t dtr);
void devfs_clear_cdevpriv(void);
are provided for manipulation of the per-file private data.
devfs_set_cdevpriv assigns the priv as private data for the file descriptor
which is used to initiate currently performed driver operation. dtr
is the function that will be called when either the last refernce to
the file goes away or devfs_clear_cdevpriv is called.
devfs_get_cdevpriv is the obvious accessor.
devfs_clear_cdevpriv allows to clear the private data for the still
open file.
The synchronization of the cdev data and file private data is left
to the driver code, I did not found any generic helper mechanism that
could be useful there.
http://people.freebsd.org/~kib/misc/fdpriv.1.patch
http://people.freebsd.org/~kib/misc/fpclone.c
Previous version of the patch was tested by Peter Holm.
I like this very much and will use it instead of devfs cloning in ipmi(4) so
we can use ipmievd when it is committed. IWBN if there was a more automated
way of handling the unload race, for example if destroy_dev() could somehow
clear all the per-open fd data. That may not be easily feasible, however.
(In theory each cdev could have a list of "attached" 'struct file' objects
that it updates in VOP_CLOSE() and for a destroy dev it detaches all the
private data after marking the cdev with a bad/null cdevsw, however, that
would bloat 'struct file' with at least one more pointer (if not two).)
--
John Baldwin
Kostik Belousov
2008-05-05 14:20:51 UTC
Permalink
Post by John Baldwin
Post by Kostik Belousov
Since the review for the clone-at-open patch (fdclone) posted some time ago
mostly says that it would be better to implement per-file private data
instead, I produced the patch along this line,
The patch does not change the cdevsw ABI, instead, three new functions
nt devfs_get_cdevpriv(void **datap);
int devfs_set_cdevpriv(void *priv, cdevpriv_dtr_t dtr);
void devfs_clear_cdevpriv(void);
are provided for manipulation of the per-file private data.
devfs_set_cdevpriv assigns the priv as private data for the file descriptor
which is used to initiate currently performed driver operation. dtr
is the function that will be called when either the last refernce to
the file goes away or devfs_clear_cdevpriv is called.
devfs_get_cdevpriv is the obvious accessor.
devfs_clear_cdevpriv allows to clear the private data for the still
open file.
The synchronization of the cdev data and file private data is left
to the driver code, I did not found any generic helper mechanism that
could be useful there.
http://people.freebsd.org/~kib/misc/fdpriv.1.patch
http://people.freebsd.org/~kib/misc/fpclone.c
Previous version of the patch was tested by Peter Holm.
I like this very much and will use it instead of devfs cloning in ipmi(4) so
we can use ipmievd when it is committed. IWBN if there was a more automated
way of handling the unload race, for example if destroy_dev() could somehow
clear all the per-open fd data. That may not be easily feasible, however.
(In theory each cdev could have a list of "attached" 'struct file' objects
that it updates in VOP_CLOSE() and for a destroy dev it detaches all the
private data after marking the cdev with a bad/null cdevsw, however, that
would bloat 'struct file' with at least one more pointer (if not two).)
Probably, I shall clarify that the dtr is called when the last reference
to the struct file going away, unless the priv data is already cleared
by the call to the devfs_clear_cdevpriv.

The problem with VOP_CLOSE() is that some actions (like forced unmount
or revoke) may cause less calls to the devfs_close then the files
pointing to the cdev. Your proposal basically means that we need,
besides the normal pointers file->vnode->cdev, have the reverse path
cdev->file. I think this is ugly and better be handled by the fdrop().
John Baldwin
2008-05-05 15:56:14 UTC
Permalink
Post by Kostik Belousov
Post by John Baldwin
Post by Kostik Belousov
Since the review for the clone-at-open patch (fdclone) posted some time ago
mostly says that it would be better to implement per-file private data
instead, I produced the patch along this line,
The patch does not change the cdevsw ABI, instead, three new functions
nt devfs_get_cdevpriv(void **datap);
int devfs_set_cdevpriv(void *priv, cdevpriv_dtr_t dtr);
void devfs_clear_cdevpriv(void);
are provided for manipulation of the per-file private data.
devfs_set_cdevpriv assigns the priv as private data for the file descriptor
which is used to initiate currently performed driver operation. dtr
is the function that will be called when either the last refernce to
the file goes away or devfs_clear_cdevpriv is called.
devfs_get_cdevpriv is the obvious accessor.
devfs_clear_cdevpriv allows to clear the private data for the still
open file.
The synchronization of the cdev data and file private data is left
to the driver code, I did not found any generic helper mechanism that
could be useful there.
http://people.freebsd.org/~kib/misc/fdpriv.1.patch
http://people.freebsd.org/~kib/misc/fpclone.c
Previous version of the patch was tested by Peter Holm.
I like this very much and will use it instead of devfs cloning in ipmi(4) so
we can use ipmievd when it is committed. IWBN if there was a more automated
way of handling the unload race, for example if destroy_dev() could somehow
clear all the per-open fd data. That may not be easily feasible, however.
(In theory each cdev could have a list of "attached" 'struct file' objects
that it updates in VOP_CLOSE() and for a destroy dev it detaches all the
private data after marking the cdev with a bad/null cdevsw, however, that
would bloat 'struct file' with at least one more pointer (if not two).)
Probably, I shall clarify that the dtr is called when the last reference
to the struct file going away, unless the priv data is already cleared
by the call to the devfs_clear_cdevpriv.
The problem with VOP_CLOSE() is that some actions (like forced unmount
or revoke) may cause less calls to the devfs_close then the files
pointing to the cdev. Your proposal basically means that we need,
besides the normal pointers file->vnode->cdev, have the reverse path
cdev->file. I think this is ugly and better be handled by the fdrop().
The disconnect as I see it is that right now destroy_dev() "orphans" devices
in that the files still exist but the backing cdev's now have dead
operations. However, now you can have a partially orphaned cdev in that not
all of the cdev state is torn down in destroy_dev(). In that sense it would
be nice to have the behavior described above, and yes you would have to have
the cdev's keep track of 'file's as I suggested.

Could you handle the revoke(2) and forced umount cases either via VOP_REVOKE()
or by adding a VOP_INACTIVE() to devfs?
--
John Baldwin
Kostik Belousov
2008-05-05 20:40:14 UTC
Permalink
Post by Kostik Belousov
Post by Kostik Belousov
Post by John Baldwin
Post by Kostik Belousov
Since the review for the clone-at-open patch (fdclone) posted some time
ago
Post by Kostik Belousov
Post by John Baldwin
Post by Kostik Belousov
mostly says that it would be better to implement per-file private data
instead, I produced the patch along this line,
The patch does not change the cdevsw ABI, instead, three new functions
nt devfs_get_cdevpriv(void **datap);
int devfs_set_cdevpriv(void *priv, cdevpriv_dtr_t dtr);
void devfs_clear_cdevpriv(void);
are provided for manipulation of the per-file private data.
devfs_set_cdevpriv assigns the priv as private data for the file
descriptor
Post by Kostik Belousov
Post by John Baldwin
Post by Kostik Belousov
which is used to initiate currently performed driver operation. dtr
is the function that will be called when either the last refernce to
the file goes away or devfs_clear_cdevpriv is called.
devfs_get_cdevpriv is the obvious accessor.
devfs_clear_cdevpriv allows to clear the private data for the still
open file.
The synchronization of the cdev data and file private data is left
to the driver code, I did not found any generic helper mechanism that
could be useful there.
http://people.freebsd.org/~kib/misc/fdpriv.1.patch
http://people.freebsd.org/~kib/misc/fpclone.c
Previous version of the patch was tested by Peter Holm.
I like this very much and will use it instead of devfs cloning in ipmi(4)
so
Post by Kostik Belousov
Post by John Baldwin
we can use ipmievd when it is committed. IWBN if there was a more
automated
Post by Kostik Belousov
Post by John Baldwin
way of handling the unload race, for example if destroy_dev() could
somehow
Post by Kostik Belousov
Post by John Baldwin
clear all the per-open fd data. That may not be easily feasible, however.
(In theory each cdev could have a list of "attached" 'struct file' objects
that it updates in VOP_CLOSE() and for a destroy dev it detaches all the
private data after marking the cdev with a bad/null cdevsw, however, that
would bloat 'struct file' with at least one more pointer (if not two).)
Probably, I shall clarify that the dtr is called when the last reference
to the struct file going away, unless the priv data is already cleared
by the call to the devfs_clear_cdevpriv.
The problem with VOP_CLOSE() is that some actions (like forced unmount
or revoke) may cause less calls to the devfs_close then the files
pointing to the cdev. Your proposal basically means that we need,
besides the normal pointers file->vnode->cdev, have the reverse path
cdev->file. I think this is ugly and better be handled by the fdrop().
The disconnect as I see it is that right now destroy_dev() "orphans" devices
in that the files still exist but the backing cdev's now have dead
operations. However, now you can have a partially orphaned cdev in that not
all of the cdev state is torn down in destroy_dev(). In that sense it would
be nice to have the behavior described above, and yes you would have to have
the cdev's keep track of 'file's as I suggested.
Could you handle the revoke(2) and forced umount cases either via
VOP_REVOKE() or by adding a VOP_INACTIVE() to devfs?
I do not see how VOP_INACTIVE would be helpful there. Use of it still
means the vnode->file walker. And, I quite dislike the backreferences
like cdev->files or vnodes->files.

My previous attempt of the cloning at open() resolved the lifecycle issues
by making the per-fd data a cdev with the usual cdev management.

I thought about either incrementing si_threadcount, or adding another
refcounter to the cdev to track presence of the priv data. But
destroy_dev() then have to block, and driver author still required to
keep track the list of the allocated priv references to be able to
actively unblock, like the d_purge. I then decided to leave this to the
driver.
Jeff Roberson
2008-05-11 07:53:12 UTC
Permalink
Post by Kostik Belousov
Since the review for the clone-at-open patch (fdclone) posted some time ago
mostly says that it would be better to implement per-file private data
instead, I produced the patch along this line,
The patch does not change the cdevsw ABI, instead, three new functions
nt devfs_get_cdevpriv(void **datap);
int devfs_set_cdevpriv(void *priv, cdevpriv_dtr_t dtr);
void devfs_clear_cdevpriv(void);
are provided for manipulation of the per-file private data.
devfs_set_cdevpriv assigns the priv as private data for the file descriptor
which is used to initiate currently performed driver operation. dtr
is the function that will be called when either the last refernce to
the file goes away or devfs_clear_cdevpriv is called.
devfs_get_cdevpriv is the obvious accessor.
devfs_clear_cdevpriv allows to clear the private data for the still
open file.
The synchronization of the cdev data and file private data is left
to the driver code, I did not found any generic helper mechanism that
could be useful there.
http://people.freebsd.org/~kib/misc/fdpriv.1.patch
http://people.freebsd.org/~kib/misc/fpclone.c
Previous version of the patch was tested by Peter Holm.
Hi Kostik,

Are these per-instances structures intended to be used by anything other
than devices? If not can we make them a union with the DTYPE_VNODE
fields to save space?

Thanks,
Jeff
Kostik Belousov
2008-05-11 11:50:30 UTC
Permalink
Post by Jeff Roberson
Post by Kostik Belousov
Since the review for the clone-at-open patch (fdclone) posted some time ago
mostly says that it would be better to implement per-file private data
instead, I produced the patch along this line,
The patch does not change the cdevsw ABI, instead, three new functions
nt devfs_get_cdevpriv(void **datap);
int devfs_set_cdevpriv(void *priv, cdevpriv_dtr_t dtr);
void devfs_clear_cdevpriv(void);
are provided for manipulation of the per-file private data.
devfs_set_cdevpriv assigns the priv as private data for the file descriptor
which is used to initiate currently performed driver operation. dtr
is the function that will be called when either the last refernce to
the file goes away or devfs_clear_cdevpriv is called.
devfs_get_cdevpriv is the obvious accessor.
devfs_clear_cdevpriv allows to clear the private data for the still
open file.
The synchronization of the cdev data and file private data is left
to the driver code, I did not found any generic helper mechanism that
could be useful there.
http://people.freebsd.org/~kib/misc/fdpriv.1.patch
http://people.freebsd.org/~kib/misc/fpclone.c
Previous version of the patch was tested by Peter Holm.
Hi Kostik,
Are these per-instances structures intended to be used by anything other
than devices? If not can we make them a union with the DTYPE_VNODE
fields to save space?
Thanks,
Jeff
The current version of the patch is at
http://people.freebsd.org/~kib/misc/fdpriv.3.patch

Per insistence of John Baldwin and request of Eric Anholt, the destructors
are called now when either file is last closed, or the device is destroyed.
This versions adds only one pointer to the struct file.

Jeff, would you, please, explicitely specify what field you propose to
union with the f_cdevpriv ?
Kostik Belousov
2008-05-11 11:58:15 UTC
Permalink
Post by Kostik Belousov
Post by Jeff Roberson
Post by Kostik Belousov
Since the review for the clone-at-open patch (fdclone) posted some time ago
mostly says that it would be better to implement per-file private data
instead, I produced the patch along this line,
The patch does not change the cdevsw ABI, instead, three new functions
nt devfs_get_cdevpriv(void **datap);
int devfs_set_cdevpriv(void *priv, cdevpriv_dtr_t dtr);
void devfs_clear_cdevpriv(void);
are provided for manipulation of the per-file private data.
devfs_set_cdevpriv assigns the priv as private data for the file descriptor
which is used to initiate currently performed driver operation. dtr
is the function that will be called when either the last refernce to
the file goes away or devfs_clear_cdevpriv is called.
devfs_get_cdevpriv is the obvious accessor.
devfs_clear_cdevpriv allows to clear the private data for the still
open file.
The synchronization of the cdev data and file private data is left
to the driver code, I did not found any generic helper mechanism that
could be useful there.
http://people.freebsd.org/~kib/misc/fdpriv.1.patch
http://people.freebsd.org/~kib/misc/fpclone.c
Previous version of the patch was tested by Peter Holm.
Hi Kostik,
Are these per-instances structures intended to be used by anything other
than devices? If not can we make them a union with the DTYPE_VNODE
fields to save space?
Thanks,
Jeff
The current version of the patch is at
http://people.freebsd.org/~kib/misc/fdpriv.3.patch
Per insistence of John Baldwin and request of Eric Anholt, the destructors
are called now when either file is last closed, or the device is destroyed.
This versions adds only one pointer to the struct file.
Jeff, would you, please, explicitely specify what field you propose to
union with the f_cdevpriv ?
Of course, I forgot to answer the question. Yes, the KPI is supposed to
be used by the drivers only. Please, note that device open files have
DTYPE_VNODE.
Jeff Roberson
2008-05-12 01:40:14 UTC
Permalink
Post by Kostik Belousov
Post by Kostik Belousov
Post by Jeff Roberson
Post by Kostik Belousov
Since the review for the clone-at-open patch (fdclone) posted some time ago
mostly says that it would be better to implement per-file private data
instead, I produced the patch along this line,
The patch does not change the cdevsw ABI, instead, three new functions
nt devfs_get_cdevpriv(void **datap);
int devfs_set_cdevpriv(void *priv, cdevpriv_dtr_t dtr);
void devfs_clear_cdevpriv(void);
are provided for manipulation of the per-file private data.
devfs_set_cdevpriv assigns the priv as private data for the file descriptor
which is used to initiate currently performed driver operation. dtr
is the function that will be called when either the last refernce to
the file goes away or devfs_clear_cdevpriv is called.
devfs_get_cdevpriv is the obvious accessor.
devfs_clear_cdevpriv allows to clear the private data for the still
open file.
The synchronization of the cdev data and file private data is left
to the driver code, I did not found any generic helper mechanism that
could be useful there.
http://people.freebsd.org/~kib/misc/fdpriv.1.patch
http://people.freebsd.org/~kib/misc/fpclone.c
Previous version of the patch was tested by Peter Holm.
Hi Kostik,
Are these per-instances structures intended to be used by anything other
than devices? If not can we make them a union with the DTYPE_VNODE
fields to save space?
Thanks,
Jeff
The current version of the patch is at
http://people.freebsd.org/~kib/misc/fdpriv.3.patch
Per insistence of John Baldwin and request of Eric Anholt, the destructors
are called now when either file is last closed, or the device is destroyed.
This versions adds only one pointer to the struct file.
Jeff, would you, please, explicitely specify what field you propose to
union with the f_cdevpriv ?
f_nextoff and f_seqcount are only used if vn_read() and vn_write() are
used. They do not apply to any other descriptors.
Post by Kostik Belousov
Of course, I forgot to answer the question. Yes, the KPI is supposed to
be used by the drivers only. Please, note that device open files have
DTYPE_VNODE.
Kostik Belousov
2008-05-12 10:16:07 UTC
Permalink
Post by Jeff Roberson
Post by Kostik Belousov
Post by Kostik Belousov
Post by Jeff Roberson
Post by Kostik Belousov
Since the review for the clone-at-open patch (fdclone) posted some time ago
mostly says that it would be better to implement per-file private data
instead, I produced the patch along this line,
The patch does not change the cdevsw ABI, instead, three new functions
nt devfs_get_cdevpriv(void **datap);
int devfs_set_cdevpriv(void *priv, cdevpriv_dtr_t dtr);
void devfs_clear_cdevpriv(void);
are provided for manipulation of the per-file private data.
devfs_set_cdevpriv assigns the priv as private data for the file descriptor
which is used to initiate currently performed driver operation. dtr
is the function that will be called when either the last refernce to
the file goes away or devfs_clear_cdevpriv is called.
devfs_get_cdevpriv is the obvious accessor.
devfs_clear_cdevpriv allows to clear the private data for the still
open file.
The synchronization of the cdev data and file private data is left
to the driver code, I did not found any generic helper mechanism that
could be useful there.
http://people.freebsd.org/~kib/misc/fdpriv.1.patch
http://people.freebsd.org/~kib/misc/fpclone.c
Previous version of the patch was tested by Peter Holm.
Hi Kostik,
Are these per-instances structures intended to be used by anything other
than devices? If not can we make them a union with the DTYPE_VNODE
fields to save space?
Thanks,
Jeff
The current version of the patch is at
http://people.freebsd.org/~kib/misc/fdpriv.3.patch
Per insistence of John Baldwin and request of Eric Anholt, the destructors
are called now when either file is last closed, or the device is destroyed.
This versions adds only one pointer to the struct file.
Jeff, would you, please, explicitely specify what field you propose to
union with the f_cdevpriv ?
f_nextoff and f_seqcount are only used if vn_read() and vn_write() are
used. They do not apply to any other descriptors.
I use the f_cdevpriv != NULL as an indicator for the necessity to enter
the cdevpriv code, in particular, locking the cdevpriv_mtx, that would
otherwise needed to be entered at each last close. I think that one
pointer for the struct file is not too big cost, do you agree ?
Post by Jeff Roberson
Post by Kostik Belousov
Of course, I forgot to answer the question. Yes, the KPI is supposed to
be used by the drivers only. Please, note that device open files have
DTYPE_VNODE.
Jeff Roberson
2008-05-12 20:03:14 UTC
Permalink
Post by Kostik Belousov
Post by Jeff Roberson
Post by Kostik Belousov
Post by Kostik Belousov
Post by Jeff Roberson
Post by Kostik Belousov
Since the review for the clone-at-open patch (fdclone) posted some time ago
mostly says that it would be better to implement per-file private data
instead, I produced the patch along this line,
The patch does not change the cdevsw ABI, instead, three new functions
nt devfs_get_cdevpriv(void **datap);
int devfs_set_cdevpriv(void *priv, cdevpriv_dtr_t dtr);
void devfs_clear_cdevpriv(void);
are provided for manipulation of the per-file private data.
devfs_set_cdevpriv assigns the priv as private data for the file descriptor
which is used to initiate currently performed driver operation. dtr
is the function that will be called when either the last refernce to
the file goes away or devfs_clear_cdevpriv is called.
devfs_get_cdevpriv is the obvious accessor.
devfs_clear_cdevpriv allows to clear the private data for the still
open file.
The synchronization of the cdev data and file private data is left
to the driver code, I did not found any generic helper mechanism that
could be useful there.
http://people.freebsd.org/~kib/misc/fdpriv.1.patch
http://people.freebsd.org/~kib/misc/fpclone.c
Previous version of the patch was tested by Peter Holm.
Hi Kostik,
Are these per-instances structures intended to be used by anything other
than devices? If not can we make them a union with the DTYPE_VNODE
fields to save space?
Thanks,
Jeff
The current version of the patch is at
http://people.freebsd.org/~kib/misc/fdpriv.3.patch
Per insistence of John Baldwin and request of Eric Anholt, the destructors
are called now when either file is last closed, or the device is destroyed.
This versions adds only one pointer to the struct file.
Jeff, would you, please, explicitely specify what field you propose to
union with the f_cdevpriv ?
f_nextoff and f_seqcount are only used if vn_read() and vn_write() are
used. They do not apply to any other descriptors.
I use the f_cdevpriv != NULL as an indicator for the necessity to enter
the cdevpriv code, in particular, locking the cdevpriv_mtx, that would
otherwise needed to be entered at each last close. I think that one
pointer for the struct file is not too big cost, do you agree ?
No, it's not a big cost, however if it is possible to avoid that is best.

Can you not check the type before checking f_cdevpriv? Should we not only
be checking cdevpriv in contexts where we know that it is not a vnode?
Post by Kostik Belousov
Post by Jeff Roberson
Post by Kostik Belousov
Of course, I forgot to answer the question. Yes, the KPI is supposed to
be used by the drivers only. Please, note that device open files have
DTYPE_VNODE.
Kostik Belousov
2008-05-13 09:37:34 UTC
Permalink
Post by Jeff Roberson
Post by Kostik Belousov
Post by Jeff Roberson
Post by Kostik Belousov
Post by Kostik Belousov
Post by Jeff Roberson
Post by Kostik Belousov
Since the review for the clone-at-open patch (fdclone) posted some
time
ago
mostly says that it would be better to implement per-file private data
instead, I produced the patch along this line,
The patch does not change the cdevsw ABI, instead, three new functions
nt devfs_get_cdevpriv(void **datap);
int devfs_set_cdevpriv(void *priv, cdevpriv_dtr_t dtr);
void devfs_clear_cdevpriv(void);
are provided for manipulation of the per-file private data.
devfs_set_cdevpriv assigns the priv as private data for the file descriptor
which is used to initiate currently performed driver operation. dtr
is the function that will be called when either the last refernce to
the file goes away or devfs_clear_cdevpriv is called.
devfs_get_cdevpriv is the obvious accessor.
devfs_clear_cdevpriv allows to clear the private data for the still
open file.
The synchronization of the cdev data and file private data is left
to the driver code, I did not found any generic helper mechanism that
could be useful there.
http://people.freebsd.org/~kib/misc/fdpriv.1.patch
http://people.freebsd.org/~kib/misc/fpclone.c
Previous version of the patch was tested by Peter Holm.
Hi Kostik,
Are these per-instances structures intended to be used by anything other
than devices? If not can we make them a union with the DTYPE_VNODE
fields to save space?
Thanks,
Jeff
The current version of the patch is at
http://people.freebsd.org/~kib/misc/fdpriv.3.patch
Per insistence of John Baldwin and request of Eric Anholt, the destructors
are called now when either file is last closed, or the device is destroyed.
This versions adds only one pointer to the struct file.
Jeff, would you, please, explicitely specify what field you propose to
union with the f_cdevpriv ?
f_nextoff and f_seqcount are only used if vn_read() and vn_write() are
used. They do not apply to any other descriptors.
I use the f_cdevpriv != NULL as an indicator for the necessity to enter
the cdevpriv code, in particular, locking the cdevpriv_mtx, that would
otherwise needed to be entered at each last close. I think that one
pointer for the struct file is not too big cost, do you agree ?
No, it's not a big cost, however if it is possible to avoid that is best.
Can you not check the type before checking f_cdevpriv? Should we not only
be checking cdevpriv in contexts where we know that it is not a vnode?
I am sorry, my english may be not enough, so I may interpret your
proposal mistakenly. I read it as a suggestion to check the file type
before accessing the f_cdevpriv.

The problem with the f_cdevpriv exists only at the _fdrop(). There, we
have a file of f_type == DTYPE_VNODE both for devfs and normal files.
I cannot check the f_vnode since the vnode may be reclaimed. The only
differentiator is the f_ops, that is devfs_ops_f for devfs file, and
vnops for the normal file during the file lifetime. Unfortunately,
f_ops is reset to the badfileops by vn_closefile before the _fdrop() is
getting called.

Reserving the flag in the f_flag looks not good due to interaction with
the userspace.

I do not want the callback to be called before the d_close() driver method
gets a chance to clean the data.
Post by Jeff Roberson
Post by Kostik Belousov
Post by Jeff Roberson
Post by Kostik Belousov
Of course, I forgot to answer the question. Yes, the KPI is supposed to
be used by the drivers only. Please, note that device open files have
DTYPE_VNODE.
Kostik Belousov
2008-05-14 09:32:23 UTC
Permalink
Post by Kostik Belousov
Post by Jeff Roberson
Post by Kostik Belousov
Post by Jeff Roberson
Post by Kostik Belousov
Post by Jeff Roberson
Post by Kostik Belousov
Since the review for the clone-at-open patch (fdclone) posted some
time
ago
mostly says that it would be better to implement per-file private data
instead, I produced the patch along this line,
The patch does not change the cdevsw ABI, instead, three new functions
nt devfs_get_cdevpriv(void **datap);
int devfs_set_cdevpriv(void *priv, cdevpriv_dtr_t dtr);
void devfs_clear_cdevpriv(void);
are provided for manipulation of the per-file private data.
devfs_set_cdevpriv assigns the priv as private data for the file descriptor
which is used to initiate currently performed driver operation. dtr
is the function that will be called when either the last refernce to
the file goes away or devfs_clear_cdevpriv is called.
devfs_get_cdevpriv is the obvious accessor.
devfs_clear_cdevpriv allows to clear the private data for the still
open file.
The synchronization of the cdev data and file private data is left
to the driver code, I did not found any generic helper mechanism that
could be useful there.
http://people.freebsd.org/~kib/misc/fdpriv.1.patch
http://people.freebsd.org/~kib/misc/fpclone.c
Previous version of the patch was tested by Peter Holm.
Hi Kostik,
Are these per-instances structures intended to be used by anything other
than devices? If not can we make them a union with the DTYPE_VNODE
fields to save space?
Thanks,
Jeff
The current version of the patch is at
http://people.freebsd.org/~kib/misc/fdpriv.3.patch
Per insistence of John Baldwin and request of Eric Anholt, the destructors
are called now when either file is last closed, or the device is destroyed.
This versions adds only one pointer to the struct file.
Jeff, would you, please, explicitely specify what field you propose to
union with the f_cdevpriv ?
f_nextoff and f_seqcount are only used if vn_read() and vn_write() are
used. They do not apply to any other descriptors.
I use the f_cdevpriv != NULL as an indicator for the necessity to enter
the cdevpriv code, in particular, locking the cdevpriv_mtx, that would
otherwise needed to be entered at each last close. I think that one
pointer for the struct file is not too big cost, do you agree ?
No, it's not a big cost, however if it is possible to avoid that is best.
Can you not check the type before checking f_cdevpriv? Should we not only
be checking cdevpriv in contexts where we know that it is not a vnode?
I am sorry, my english may be not enough, so I may interpret your
proposal mistakenly. I read it as a suggestion to check the file type
before accessing the f_cdevpriv.
The problem with the f_cdevpriv exists only at the _fdrop(). There, we
have a file of f_type == DTYPE_VNODE both for devfs and normal files.
I cannot check the f_vnode since the vnode may be reclaimed. The only
differentiator is the f_ops, that is devfs_ops_f for devfs file, and
vnops for the normal file during the file lifetime. Unfortunately,
f_ops is reset to the badfileops by vn_closefile before the _fdrop() is
getting called.
Reserving the flag in the f_flag looks not good due to interaction with
the userspace.
I do not want the callback to be called before the d_close() driver method
gets a chance to clean the data.
So, I cannot implement overlay of the cdevpriv over the seqaccess data
without some additional flag. On the other hand, use of 2 bytes for the
f_type is overkill when the range of the types is [1,8]. I divided it
to two one-byte fields, and new field is used for filetype-specific
flags.

http://people.freebsd.org/~kib/misc/fdpriv.4.patch
shall give the same size of the struct file while not causing unconditional
acquire of the cdevpriv_mtx on the last file close.

Jeff, do you agree with the proposed vivisection of the f_type ? I will
ask Peter to retest the patch then.
John Baldwin
2008-05-15 11:30:07 UTC
Permalink
Post by Kostik Belousov
Post by Kostik Belousov
Post by Jeff Roberson
Post by Kostik Belousov
Post by Jeff Roberson
Post by Kostik Belousov
Post by Jeff Roberson
Post by Kostik Belousov
Since the review for the clone-at-open patch (fdclone) posted some
time
ago
mostly says that it would be better to implement per-file private data
instead, I produced the patch along this line,
The patch does not change the cdevsw ABI, instead, three new functions
nt devfs_get_cdevpriv(void **datap);
int devfs_set_cdevpriv(void *priv, cdevpriv_dtr_t dtr);
void devfs_clear_cdevpriv(void);
are provided for manipulation of the per-file private data.
devfs_set_cdevpriv assigns the priv as private data for the file
descriptor
which is used to initiate currently performed driver operation. dtr
is the function that will be called when either the last refernce to
the file goes away or devfs_clear_cdevpriv is called.
devfs_get_cdevpriv is the obvious accessor.
devfs_clear_cdevpriv allows to clear the private data for the still
open file.
The synchronization of the cdev data and file private data is left
to the driver code, I did not found any generic helper mechanism that
could be useful there.
http://people.freebsd.org/~kib/misc/fdpriv.1.patch
http://people.freebsd.org/~kib/misc/fpclone.c
Previous version of the patch was tested by Peter Holm.
Hi Kostik,
Are these per-instances structures intended to be used by anything other
than devices? If not can we make them a union with the DTYPE_VNODE
fields to save space?
Thanks,
Jeff
The current version of the patch is at
http://people.freebsd.org/~kib/misc/fdpriv.3.patch
Per insistence of John Baldwin and request of Eric Anholt, the destructors
are called now when either file is last closed, or the device is destroyed.
This versions adds only one pointer to the struct file.
Jeff, would you, please, explicitely specify what field you propose to
union with the f_cdevpriv ?
f_nextoff and f_seqcount are only used if vn_read() and vn_write() are
used. They do not apply to any other descriptors.
I use the f_cdevpriv != NULL as an indicator for the necessity to enter
the cdevpriv code, in particular, locking the cdevpriv_mtx, that would
otherwise needed to be entered at each last close. I think that one
pointer for the struct file is not too big cost, do you agree ?
No, it's not a big cost, however if it is possible to avoid that is best.
Can you not check the type before checking f_cdevpriv? Should we not only
be checking cdevpriv in contexts where we know that it is not a vnode?
I am sorry, my english may be not enough, so I may interpret your
proposal mistakenly. I read it as a suggestion to check the file type
before accessing the f_cdevpriv.
The problem with the f_cdevpriv exists only at the _fdrop(). There, we
have a file of f_type == DTYPE_VNODE both for devfs and normal files.
I cannot check the f_vnode since the vnode may be reclaimed. The only
differentiator is the f_ops, that is devfs_ops_f for devfs file, and
vnops for the normal file during the file lifetime. Unfortunately,
f_ops is reset to the badfileops by vn_closefile before the _fdrop() is
getting called.
Reserving the flag in the f_flag looks not good due to interaction with
the userspace.
I do not want the callback to be called before the d_close() driver method
gets a chance to clean the data.
So, I cannot implement overlay of the cdevpriv over the seqaccess data
without some additional flag. On the other hand, use of 2 bytes for the
f_type is overkill when the range of the types is [1,8]. I divided it
to two one-byte fields, and new field is used for filetype-specific
flags.
http://people.freebsd.org/~kib/misc/fdpriv.4.patch
shall give the same size of the struct file while not causing unconditional
acquire of the cdevpriv_mtx on the last file close.
Jeff, do you agree with the proposed vivisection of the f_type ? I will
ask Peter to retest the patch then.
I would rather add the new pointer to struct file and avoid breaking the ABI
of fstat. That is, I would like this patch to be MFCable, but you can't MFC
this w/o breaking the ABI of struct file since fstat(1) will be reading the
full short to get the DTYPE.
--
John Baldwin
Kostik Belousov
2008-05-15 12:30:55 UTC
Permalink
Post by John Baldwin
Post by Kostik Belousov
So, I cannot implement overlay of the cdevpriv over the seqaccess data
without some additional flag. On the other hand, use of 2 bytes for the
f_type is overkill when the range of the types is [1,8]. I divided it
to two one-byte fields, and new field is used for filetype-specific
flags.
http://people.freebsd.org/~kib/misc/fdpriv.4.patch
shall give the same size of the struct file while not causing unconditional
acquire of the cdevpriv_mtx on the last file close.
Jeff, do you agree with the proposed vivisection of the f_type ? I will
ask Peter to retest the patch then.
I would rather add the new pointer to struct file and avoid breaking the ABI
of fstat. That is, I would like this patch to be MFCable, but you can't MFC
this w/o breaking the ABI of struct file since fstat(1) will be reading the
full short to get the DTYPE.
Oh, I thought that fstat uses struct xfile, at least for live systems.

I considered to introduce new f_type value for devfs files; the
DTYPE_VNODE is not a complete truth due to custom f_ops. But DTYPE_VNODE
is special-cased in enough locations to make this much less preferrable
then another pointer.
John Baldwin
2008-05-19 14:15:34 UTC
Permalink
Post by Kostik Belousov
Post by John Baldwin
Post by Kostik Belousov
So, I cannot implement overlay of the cdevpriv over the seqaccess data
without some additional flag. On the other hand, use of 2 bytes for the
f_type is overkill when the range of the types is [1,8]. I divided it
to two one-byte fields, and new field is used for filetype-specific
flags.
http://people.freebsd.org/~kib/misc/fdpriv.4.patch
shall give the same size of the struct file while not causing
unconditional acquire of the cdevpriv_mtx on the last file close.
Jeff, do you agree with the proposed vivisection of the f_type ? I will
ask Peter to retest the patch then.
I would rather add the new pointer to struct file and avoid breaking the
ABI of fstat. That is, I would like this patch to be MFCable, but you
can't MFC this w/o breaking the ABI of struct file since fstat(1) will be
reading the full short to get the DTYPE.
Oh, I thought that fstat uses struct xfile, at least for live systems.
I considered to introduce new f_type value for devfs files; the
DTYPE_VNODE is not a complete truth due to custom f_ops. But DTYPE_VNODE
is special-cased in enough locations to make this much less preferrable
then another pointer.
fstat always uses libkvm and direct access unfortunately. Also, I greatly
care about fstat on crash dumps as well. :)
--
John Baldwin
Kostik Belousov
2008-05-19 15:38:11 UTC
Permalink
Post by John Baldwin
Post by Kostik Belousov
Post by John Baldwin
Post by Kostik Belousov
So, I cannot implement overlay of the cdevpriv over the seqaccess data
without some additional flag. On the other hand, use of 2 bytes for the
f_type is overkill when the range of the types is [1,8]. I divided it
to two one-byte fields, and new field is used for filetype-specific
flags.
http://people.freebsd.org/~kib/misc/fdpriv.4.patch
shall give the same size of the struct file while not causing
unconditional acquire of the cdevpriv_mtx on the last file close.
Jeff, do you agree with the proposed vivisection of the f_type ? I will
ask Peter to retest the patch then.
I would rather add the new pointer to struct file and avoid breaking the
ABI of fstat. That is, I would like this patch to be MFCable, but you
can't MFC this w/o breaking the ABI of struct file since fstat(1) will be
reading the full short to get the DTYPE.
Oh, I thought that fstat uses struct xfile, at least for live systems.
I considered to introduce new f_type value for devfs files; the
DTYPE_VNODE is not a complete truth due to custom f_ops. But DTYPE_VNODE
is special-cased in enough locations to make this much less preferrable
then another pointer.
fstat always uses libkvm and direct access unfortunately. Also, I greatly
care about fstat on crash dumps as well. :)
So, what is the conclusion ? Does anybody have any further objections
against committing
http://people.freebsd.org/~kib/misc/fdpriv.3.patch

I plan to do it this week.
John Baldwin
2008-05-19 16:09:34 UTC
Permalink
Post by Kostik Belousov
Post by John Baldwin
Post by Kostik Belousov
Post by John Baldwin
Post by Kostik Belousov
So, I cannot implement overlay of the cdevpriv over the seqaccess data
without some additional flag. On the other hand, use of 2 bytes for the
f_type is overkill when the range of the types is [1,8]. I divided it
to two one-byte fields, and new field is used for filetype-specific
flags.
http://people.freebsd.org/~kib/misc/fdpriv.4.patch
shall give the same size of the struct file while not causing
unconditional acquire of the cdevpriv_mtx on the last file close.
Jeff, do you agree with the proposed vivisection of the f_type ? I will
ask Peter to retest the patch then.
I would rather add the new pointer to struct file and avoid breaking the
ABI of fstat. That is, I would like this patch to be MFCable, but you
can't MFC this w/o breaking the ABI of struct file since fstat(1) will be
reading the full short to get the DTYPE.
Oh, I thought that fstat uses struct xfile, at least for live systems.
I considered to introduce new f_type value for devfs files; the
DTYPE_VNODE is not a complete truth due to custom f_ops. But DTYPE_VNODE
is special-cased in enough locations to make this much less preferrable
then another pointer.
fstat always uses libkvm and direct access unfortunately. Also, I greatly
care about fstat on crash dumps as well. :)
So, what is the conclusion ? Does anybody have any further objections
against committing
http://people.freebsd.org/~kib/misc/fdpriv.3.patch
I plan to do it this week.
I think the current patch is good.
--
John Baldwin
Loading...