inputs. This framework was never widely used, and most attempts to
use it were broken. Drivers should instead be exposing domain-specific
interfaces either to kernel or to userspace.
-Who: Pavel Machek <pavel@suse.cz>
+Who: Pavel Machek <pavel@ucw.cz>
---------------------------
---------------------------
-What: PCMCIA control ioctl (needed for pcmcia-cs [cardmgr, cardctl])
-When: 2.6.35/2.6.36
-Files: drivers/pcmcia/: pcmcia_ioctl.c
-Why: With the 16-bit PCMCIA subsystem now behaving (almost) like a
- normal hotpluggable bus, and with it using the default kernel
- infrastructure (hotplug, driver core, sysfs) keeping the PCMCIA
- control ioctl needed by cardmgr and cardctl from pcmcia-cs is
- unnecessary and potentially harmful (it does not provide for
- proper locking), and makes further cleanups and integration of the
- PCMCIA subsystem into the Linux kernel device driver model more
- difficult. The features provided by cardmgr and cardctl are either
- handled by the kernel itself now or are available in the new
- pcmciautils package available at
- http://kernel.org/pub/linux/utils/kernel/pcmcia/
-
- For all architectures except ARM, the associated config symbol
- has been removed from kernel 2.6.34; for ARM, it will be likely
- be removed from kernel 2.6.35. The actual code will then likely
- be removed from kernel 2.6.36.
-Who: Dominik Brodowski <linux@dominikbrodowski.net>
-
----------------------------
-
What: sys_sysctl
When: September 2010
Option: CONFIG_SYSCTL_SYSCALL
---------------------------
+What: /proc/<pid>/oom_adj
+When: August 2012
+Why: /proc/<pid>/oom_adj allows userspace to influence the oom killer's
+ badness heuristic used to determine which task to kill when the kernel
+ is out of memory.
+
+ The badness heuristic has since been rewritten since the introduction of
+ this tunable such that its meaning is deprecated. The value was
+ implemented as a bitshift on a score generated by the badness()
+ function that did not have any precise units of measure. With the
+ rewrite, the score is given as a proportion of available memory to the
+ task allocating pages, so using a bitshift which grows the score
+ exponentially is, thus, impossible to tune with fine granularity.
+
+ A much more powerful interface, /proc/<pid>/oom_score_adj, was
+ introduced with the oom killer rewrite that allows users to increase or
+ decrease the badness() score linearly. This interface will replace
+ /proc/<pid>/oom_adj.
+
+ A warning will be emitted to the kernel log if an application uses this
+ deprecated interface. After it is printed once, future warnings will be
+ suppressed until the kernel is rebooted.
+
+---------------------------
+
What: remove EXPORT_SYMBOL(kernel_thread)
When: August 2006
Files: arch/*/kernel/*_ksyms.c
---------------------------
-What: CONFIG_NF_CT_ACCT
-When: 2.6.29
-Why: Accounting can now be enabled/disabled without kernel recompilation.
- Currently used only to set a default value for a feature that is also
- controlled by a kernel/module/sysfs/sysctl parameter.
-Who: Krzysztof Piotr Oledzki <ole@ans.pl>
-
----------------------------
-
What: sysfs ui for changing p4-clockmod parameters
When: September 2009
Why: See commits 129f8ae9b1b5be94517da76009ea956e89104ce8 and
Why: Should be implemented in userspace, policy daemon.
Who: Johannes Berg <johannes@sipsolutions.net>
- ---------------------------
-
- What: CONFIG_INOTIFY
- When: 2.6.33
- Why: last user (audit) will be converted to the newer more generic
- and more easily maintained fsnotify subsystem
- Who: Eric Paris <eparis@redhat.com>
-
----------------------------
-What: lock_policy_rwsem_* and unlock_policy_rwsem_* will not be
- exported interface anymore.
-When: 2.6.33
-Why: cpu_policy_rwsem has a new cleaner definition making it local to
- cpufreq core and contained inside cpufreq.c. Other dependent
- drivers should not use it in order to safely avoid lockdep issues.
-Who: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
-
-----------------------------
-
What: sound-slot/service-* module aliases and related clutters in
sound/sound_core.c
When: August 2010
----------------------------
-What: usbvideo quickcam_messenger driver
-When: 2.6.35
-Files: drivers/media/video/usbvideo/quickcam_messenger.[ch]
-Why: obsolete v4l1 driver replaced by gspca_stv06xx
-Who: Hans de Goede <hdegoede@redhat.com>
-
-----------------------------
-
-What: ov511 v4l1 driver
-When: 2.6.35
-Files: drivers/media/video/ov511.[ch]
-Why: obsolete v4l1 driver replaced by gspca_ov519
-Who: Hans de Goede <hdegoede@redhat.com>
-
-----------------------------
-
-What: w9968cf v4l1 driver
-When: 2.6.35
-Files: drivers/media/video/w9968cf*.[ch]
-Why: obsolete v4l1 driver replaced by gspca_ov519
-Who: Hans de Goede <hdegoede@redhat.com>
-
-----------------------------
-
-What: ovcamchip sensor framework
-When: 2.6.35
-Files: drivers/media/video/ovcamchip/*
-Why: Only used by obsoleted v4l1 drivers
-Who: Hans de Goede <hdegoede@redhat.com>
-
-----------------------------
-
-What: stv680 v4l1 driver
-When: 2.6.35
-Files: drivers/media/video/stv680.[ch]
-Why: obsolete v4l1 driver replaced by gspca_stv0680
-Who: Hans de Goede <hdegoede@redhat.com>
-
-----------------------------
-
-What: zc0301 v4l driver
-When: 2.6.35
-Files: drivers/media/video/zc0301/*
-Why: Duplicate functionality with the gspca_zc3xx driver, zc0301 only
- supports 2 USB-ID's (because it only supports a limited set of
- sensors) wich are also supported by the gspca_zc3xx driver
- (which supports 53 USB-ID's in total)
-Who: Hans de Goede <hdegoede@redhat.com>
-
-----------------------------
-
What: sysfs-class-rfkill state file
When: Feb 2014
Files: net/rfkill/core.c
----------------------------
-What: KVM memory aliases support
-When: July 2010
-Why: Memory aliasing support is used for speeding up guest vga access
- through the vga windows.
-
- Modern userspace no longer uses this feature, so it's just bitrotted
- code and can be removed with no impact.
-Who: Avi Kivity <avi@redhat.com>
-
-----------------------------
-
-What: xtime, wall_to_monotonic
-When: 2.6.36+
-Files: kernel/time/timekeeping.c include/linux/time.h
-Why: Cleaning up timekeeping internal values. Please use
- existing timekeeping accessor functions to access
- the equivalent functionality.
-Who: John Stultz <johnstul@us.ibm.com>
-
-----------------------------
-
-What: KVM kernel-allocated memory slots
-When: July 2010
-Why: Since 2.6.25, kvm supports user-allocated memory slots, which are
- much more flexible than kernel-allocated slots. All current userspace
- supports the newer interface and this code can be removed with no
- impact.
-Who: Avi Kivity <avi@redhat.com>
-
-----------------------------
-
What: KVM paravirt mmu host support
When: January 2011
Why: The paravirt mmu host support is slower than non-paravirt mmu, both
* Copyright (C) 1997-2000 Jakub Jelinek (jakub@redhat.com)
* Copyright (C) 1998 Eddie C. Dost (ecd@skynet.be)
* Copyright (C) 2001,2002 Andi Kleen, SuSE Labs
- * Copyright (C) 2003 Pavel Machek (pavel@suse.cz)
+ * Copyright (C) 2003 Pavel Machek (pavel@ucw.cz)
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*/
+#include <linux/stddef.h>
#include <linux/kernel.h>
#include <linux/linkage.h>
#include <linux/compat.h>
error = user_path(pathname, &path);
if (!error) {
struct kstatfs tmp;
- error = vfs_statfs(path.dentry, &tmp);
+ error = vfs_statfs(&path, &tmp);
if (!error)
error = put_compat_statfs(buf, &tmp);
path_put(&path);
file = fget(fd);
if (!file)
goto out;
- error = vfs_statfs(file->f_path.dentry, &tmp);
+ error = vfs_statfs(&file->f_path, &tmp);
if (!error)
error = put_compat_statfs(buf, &tmp);
fput(file);
error = user_path(pathname, &path);
if (!error) {
struct kstatfs tmp;
- error = vfs_statfs(path.dentry, &tmp);
+ error = vfs_statfs(&path, &tmp);
if (!error)
error = put_compat_statfs64(buf, &tmp);
path_put(&path);
file = fget(fd);
if (!file)
goto out;
- error = vfs_statfs(file->f_path.dentry, &tmp);
+ error = vfs_statfs(&file->f_path, &tmp);
if (!error)
error = put_compat_statfs64(buf, &tmp);
fput(file);
sb = user_get_super(new_decode_dev(dev));
if (!sb)
return -EINVAL;
- err = vfs_statfs(sb->s_root, &sbuf);
+ err = statfs_by_dentry(sb->s_root, &sbuf);
drop_super(sb);
if (err)
return err;
return retval;
}
-#define NAME_OFFSET(de) ((int) ((de)->d_name - (char __user *) (de)))
-
struct compat_old_linux_dirent {
compat_ulong_t d_ino;
compat_ulong_t d_offset;
struct compat_linux_dirent __user * dirent;
struct compat_getdents_callback *buf = __buf;
compat_ulong_t d_ino;
- int reclen = ALIGN(NAME_OFFSET(dirent) + namlen + 2, sizeof(compat_long_t));
+ int reclen = ALIGN(offsetof(struct compat_linux_dirent, d_name) +
+ namlen + 2, sizeof(compat_long_t));
buf->error = -EINVAL; /* only used if we fail.. */
if (reclen > buf->count)
{
struct linux_dirent64 __user *dirent;
struct compat_getdents_callback64 *buf = __buf;
- int jj = NAME_OFFSET(dirent);
- int reclen = ALIGN(jj + namlen + 1, sizeof(u64));
+ int reclen = ALIGN(offsetof(struct linux_dirent64, d_name) + namlen + 1,
+ sizeof(u64));
u64 off;
buf->error = -EINVAL; /* only used if we fail.. */
if (iov != iovstack)
kfree(iov);
if ((ret + (type == READ)) > 0) {
- struct dentry *dentry = file->f_path.dentry;
if (type == READ)
- fsnotify_access(dentry);
+ fsnotify_access(file);
else
- fsnotify_modify(dentry);
+ fsnotify_modify(file);
}
return ret;
}
#include <linux/mm.h>
#include <linux/stat.h>
#include <linux/fcntl.h>
-#include <linux/smp_lock.h>
#include <linux/swap.h>
#include <linux/string.h>
#include <linux/init.h>
if (file->f_path.mnt->mnt_flags & MNT_NOEXEC)
goto exit;
- fsnotify_open(file->f_path.dentry);
+ fsnotify_open(file);
error = -ENOEXEC;
if(file->f_op) {
else
stack_base = vma->vm_start - stack_expand;
#endif
+ current->mm->start_stack = bprm->p;
ret = expand_stack(vma, stack_base);
if (ret)
ret = -EFAULT;
if (file->f_path.mnt->mnt_flags & MNT_NOEXEC)
goto exit;
- fsnotify_open(file->f_path.dentry);
+ fsnotify_open(file);
err = deny_write_access(file);
if (err)
*/
clear_thread_flag(TIF_SIGPENDING);
- /*
- * lock_kernel() because format_corename() is controlled by sysctl, which
- * uses lock_kernel()
- */
- lock_kernel();
ispipe = format_corename(corename, signr);
- unlock_kernel();
if (ispipe) {
int dump_count;
#include <linux/pagemap.h>
#include <linux/cdev.h>
#include <linux/bootmem.h>
- #include <linux/inotify.h>
#include <linux/fsnotify.h>
#include <linux/mount.h>
#include <linux/async.h>
INIT_RAW_PRIO_TREE_ROOT(&inode->i_data.i_mmap);
INIT_LIST_HEAD(&inode->i_data.i_mmap_nonlinear);
i_size_ordered_init(inode);
- #ifdef CONFIG_INOTIFY
- INIT_LIST_HEAD(&inode->inotify_watches);
- mutex_init(&inode->inotify_mutex);
- #endif
#ifdef CONFIG_FSNOTIFY
- INIT_HLIST_HEAD(&inode->i_fsnotify_mark_entries);
+ INIT_HLIST_HEAD(&inode->i_fsnotify_marks);
#endif
}
EXPORT_SYMBOL(inode_init_once);
inodes_stat.nr_unused--;
}
-/**
- * clear_inode - clear an inode
- * @inode: inode to clear
- *
- * This is called by the filesystem to tell us
- * that the inode is no longer useful. We just
- * terminate it with extreme prejudice.
- */
-void clear_inode(struct inode *inode)
+void end_writeback(struct inode *inode)
{
might_sleep();
- invalidate_inode_buffers(inode);
-
BUG_ON(inode->i_data.nrpages);
+ BUG_ON(!list_empty(&inode->i_data.private_list));
BUG_ON(!(inode->i_state & I_FREEING));
BUG_ON(inode->i_state & I_CLEAR);
inode_sync_wait(inode);
- if (inode->i_sb->s_op->clear_inode)
- inode->i_sb->s_op->clear_inode(inode);
+ inode->i_state = I_FREEING | I_CLEAR;
+}
+EXPORT_SYMBOL(end_writeback);
+
+static void evict(struct inode *inode)
+{
+ const struct super_operations *op = inode->i_sb->s_op;
+
+ if (op->evict_inode) {
+ op->evict_inode(inode);
+ } else {
+ if (inode->i_data.nrpages)
+ truncate_inode_pages(&inode->i_data, 0);
+ end_writeback(inode);
+ }
if (S_ISBLK(inode->i_mode) && inode->i_bdev)
bd_forget(inode);
if (S_ISCHR(inode->i_mode) && inode->i_cdev)
cd_forget(inode);
- inode->i_state = I_CLEAR;
}
-EXPORT_SYMBOL(clear_inode);
/*
* dispose_list - dispose of the contents of a local list
inode = list_first_entry(head, struct inode, i_list);
list_del(&inode->i_list);
- if (inode->i_data.nrpages)
- truncate_inode_pages(&inode->i_data, 0);
- clear_inode(inode);
+ evict(inode);
spin_lock(&inode_lock);
hlist_del_init(&inode->i_hash);
down_write(&iprune_sem);
spin_lock(&inode_lock);
- inotify_unmount_inodes(&sb->s_inodes);
fsnotify_unmount_inodes(&sb->s_inodes);
busy = invalidate_list(&sb->s_inodes, &throw_away);
spin_unlock(&inode_lock);
continue;
if (!test(inode, data))
continue;
- if (inode->i_state & (I_FREEING|I_CLEAR|I_WILL_FREE)) {
+ if (inode->i_state & (I_FREEING|I_WILL_FREE)) {
__wait_on_freeing_inode(inode);
goto repeat;
}
continue;
if (inode->i_sb != sb)
continue;
- if (inode->i_state & (I_FREEING|I_CLEAR|I_WILL_FREE)) {
+ if (inode->i_state & (I_FREEING|I_WILL_FREE)) {
__wait_on_freeing_inode(inode);
goto repeat;
}
struct inode *igrab(struct inode *inode)
{
spin_lock(&inode_lock);
- if (!(inode->i_state & (I_FREEING|I_CLEAR|I_WILL_FREE)))
+ if (!(inode->i_state & (I_FREEING|I_WILL_FREE)))
__iget(inode);
else
/*
continue;
if (old->i_sb != sb)
continue;
- if (old->i_state & (I_FREEING|I_CLEAR|I_WILL_FREE))
+ if (old->i_state & (I_FREEING|I_WILL_FREE))
continue;
break;
}
continue;
if (!test(old, data))
continue;
- if (old->i_state & (I_FREEING|I_CLEAR|I_WILL_FREE))
+ if (old->i_state & (I_FREEING|I_WILL_FREE))
continue;
break;
}
}
EXPORT_SYMBOL(remove_inode_hash);
+int generic_delete_inode(struct inode *inode)
+{
+ return 1;
+}
+EXPORT_SYMBOL(generic_delete_inode);
+
/*
- * Tell the filesystem that this inode is no longer of any interest and should
- * be completely destroyed.
- *
- * We leave the inode in the inode hash table until *after* the filesystem's
- * ->delete_inode completes. This ensures that an iget (such as nfsd might
- * instigate) will always find up-to-date information either in the hash or on
- * disk.
- *
- * I_FREEING is set so that no-one will take a new reference to the inode while
- * it is being deleted.
+ * Normal UNIX filesystem behaviour: delete the
+ * inode when the usage count drops to zero, and
+ * i_nlink is zero.
*/
-void generic_delete_inode(struct inode *inode)
+int generic_drop_inode(struct inode *inode)
{
- const struct super_operations *op = inode->i_sb->s_op;
-
- list_del_init(&inode->i_list);
- list_del_init(&inode->i_sb_list);
- WARN_ON(inode->i_state & I_NEW);
- inode->i_state |= I_FREEING;
- inodes_stat.nr_inodes--;
- spin_unlock(&inode_lock);
-
- if (op->delete_inode) {
- void (*delete)(struct inode *) = op->delete_inode;
- /* Filesystems implementing their own
- * s_op->delete_inode are required to call
- * truncate_inode_pages and clear_inode()
- * internally */
- delete(inode);
- } else {
- truncate_inode_pages(&inode->i_data, 0);
- clear_inode(inode);
- }
- spin_lock(&inode_lock);
- hlist_del_init(&inode->i_hash);
- spin_unlock(&inode_lock);
- wake_up_inode(inode);
- BUG_ON(inode->i_state != I_CLEAR);
- destroy_inode(inode);
+ return !inode->i_nlink || hlist_unhashed(&inode->i_hash);
}
-EXPORT_SYMBOL(generic_delete_inode);
+EXPORT_SYMBOL_GPL(generic_drop_inode);
-/**
- * generic_detach_inode - remove inode from inode lists
- * @inode: inode to remove
- *
- * Remove inode from inode lists, write it if it's dirty. This is just an
- * internal VFS helper exported for hugetlbfs. Do not use!
+/*
+ * Called when we're dropping the last reference
+ * to an inode.
*
- * Returns 1 if inode should be completely destroyed.
+ * Call the FS "drop_inode()" function, defaulting to
+ * the legacy UNIX filesystem behaviour. If it tells
+ * us to evict inode, do so. Otherwise, retain inode
+ * in cache if fs is alive, sync and evict if fs is
+ * shutting down.
*/
-int generic_detach_inode(struct inode *inode)
+static void iput_final(struct inode *inode)
{
struct super_block *sb = inode->i_sb;
+ const struct super_operations *op = inode->i_sb->s_op;
+ int drop;
- if (!hlist_unhashed(&inode->i_hash)) {
+ if (op && op->drop_inode)
+ drop = op->drop_inode(inode);
+ else
+ drop = generic_drop_inode(inode);
+
+ if (!drop) {
if (!(inode->i_state & (I_DIRTY|I_SYNC)))
list_move(&inode->i_list, &inode_unused);
inodes_stat.nr_unused++;
if (sb->s_flags & MS_ACTIVE) {
spin_unlock(&inode_lock);
- return 0;
+ return;
}
WARN_ON(inode->i_state & I_NEW);
inode->i_state |= I_WILL_FREE;
inode->i_state |= I_FREEING;
inodes_stat.nr_inodes--;
spin_unlock(&inode_lock);
- return 1;
-}
-EXPORT_SYMBOL_GPL(generic_detach_inode);
-
-static void generic_forget_inode(struct inode *inode)
-{
- if (!generic_detach_inode(inode))
- return;
- if (inode->i_data.nrpages)
- truncate_inode_pages(&inode->i_data, 0);
- clear_inode(inode);
+ evict(inode);
+ spin_lock(&inode_lock);
+ hlist_del_init(&inode->i_hash);
+ spin_unlock(&inode_lock);
wake_up_inode(inode);
+ BUG_ON(inode->i_state != (I_FREEING | I_CLEAR));
destroy_inode(inode);
}
-/*
- * Normal UNIX filesystem behaviour: delete the
- * inode when the usage count drops to zero, and
- * i_nlink is zero.
- */
-void generic_drop_inode(struct inode *inode)
-{
- if (!inode->i_nlink)
- generic_delete_inode(inode);
- else
- generic_forget_inode(inode);
-}
-EXPORT_SYMBOL_GPL(generic_drop_inode);
-
-/*
- * Called when we're dropping the last reference
- * to an inode.
- *
- * Call the FS "drop()" function, defaulting to
- * the legacy UNIX filesystem behaviour..
- *
- * NOTE! NOTE! NOTE! We're called with the inode lock
- * held, and the drop function is supposed to release
- * the lock!
- */
-static inline void iput_final(struct inode *inode)
-{
- const struct super_operations *op = inode->i_sb->s_op;
- void (*drop)(struct inode *) = generic_drop_inode;
-
- if (op && op->drop_inode)
- drop = op->drop_inode;
- drop(inode);
-}
-
/**
* iput - put an inode
* @inode: inode to put
void iput(struct inode *inode)
{
if (inode) {
- BUG_ON(inode->i_state == I_CLEAR);
+ BUG_ON(inode->i_state & I_CLEAR);
if (atomic_dec_and_lock(&inode->i_count, &inode_lock))
iput_final(inode);
if (retval)
return retval;
- return security_inode_permission(inode,
- mask & (MAY_READ|MAY_WRITE|MAY_EXEC|MAY_APPEND));
+ return security_inode_permission(inode, mask);
}
/**
*/
error = locks_verify_locked(inode);
if (!error)
- error = security_path_truncate(path, 0,
- ATTR_MTIME|ATTR_CTIME|ATTR_OPEN);
+ error = security_path_truncate(path);
if (!error) {
error = do_truncate(path->dentry, 0,
ATTR_MTIME|ATTR_CTIME|ATTR_OPEN,
{
int error;
int is_dir = S_ISDIR(old_dentry->d_inode->i_mode);
- const char *old_name;
+ const unsigned char *old_name;
if (old_dentry->d_inode == new_dentry->d_inode)
return 0;
#include <linux/log2.h>
#include <linux/idr.h>
#include <linux/fs_struct.h>
+ #include <linux/fsnotify.h>
#include <asm/uaccess.h>
#include <asm/unistd.h>
#include "pnode.h"
INIT_LIST_HEAD(&mnt->mnt_share);
INIT_LIST_HEAD(&mnt->mnt_slave_list);
INIT_LIST_HEAD(&mnt->mnt_slave);
+ #ifdef CONFIG_FSNOTIFY
+ INIT_HLIST_HEAD(&mnt->mnt_fsnotify_marks);
+ #endif
#ifdef CONFIG_SMP
mnt->mnt_writers = alloc_percpu(int);
if (!mnt->mnt_writers)
* provides barriers, so count_mnt_writers() below is safe. AV
*/
WARN_ON(count_mnt_writers(mnt));
+ fsnotify_vfsmount_delete(mnt);
dput(mnt->mnt_root);
free_vfsmnt(mnt);
deactivate_super(sb);
if (flags & MS_RDONLY)
mnt_flags |= MNT_READONLY;
- flags &= ~(MS_NOSUID | MS_NOEXEC | MS_NODEV | MS_ACTIVE |
+ flags &= ~(MS_NOSUID | MS_NOEXEC | MS_NODEV | MS_ACTIVE | MS_BORN |
MS_NOATIME | MS_NODIRATIME | MS_RELATIME| MS_KERNMOUNT |
MS_STRICTATIME);
return error;
}
-#endif /* defined(CONFIG_NFS_V4) */
+#endif /* defined(CONFIG_NFSD_V4) */
#ifdef CONFIG_NFSD_V3
/*
loff_t offset, struct kvec *vec, int vlen, unsigned long *count)
{
struct inode *inode;
- struct raparms *ra;
mm_segment_t oldfs;
__be32 err;
int host_err;
if (svc_msnfs(fhp) && !lock_may_read(inode, offset, *count))
goto out;
- /* Get readahead parameters */
- ra = nfsd_get_raparms(inode->i_sb->s_dev, inode->i_ino);
-
- if (ra && ra->p_set)
- file->f_ra = ra->p_ra;
-
if (file->f_op->splice_read && rqstp->rq_splice_ok) {
struct splice_desc sd = {
.len = 0,
set_fs(oldfs);
}
- /* Write back readahead params */
- if (ra) {
- struct raparm_hbucket *rab = &raparm_hash[ra->p_hindex];
- spin_lock(&rab->pb_lock);
- ra->p_ra = file->f_ra;
- ra->p_set = 1;
- ra->p_count--;
- spin_unlock(&rab->pb_lock);
- }
-
if (host_err >= 0) {
nfsdstats.io_read += host_err;
*count = host_err;
err = 0;
- fsnotify_access(file->f_path.dentry);
+ fsnotify_access(file);
} else
err = nfserrno(host_err);
out:
goto out_nfserr;
*cnt = host_err;
nfsdstats.io_write += host_err;
- fsnotify_modify(file->f_path.dentry);
+ fsnotify_modify(file);
/* clear setuid/setgid flag after write */
if (inode->i_mode & (S_ISUID | S_ISGID))
* on entry. On return, *count contains the number of bytes actually read.
* N.B. After this call fhp needs an fh_put
*/
+__be32 nfsd_read(struct svc_rqst *rqstp, struct svc_fh *fhp,
+ loff_t offset, struct kvec *vec, int vlen, unsigned long *count)
+{
+ struct file *file;
+ struct inode *inode;
+ struct raparms *ra;
+ __be32 err;
+
+ err = nfsd_open(rqstp, fhp, S_IFREG, NFSD_MAY_READ, &file);
+ if (err)
+ return err;
+
+ inode = file->f_path.dentry->d_inode;
+
+ /* Get readahead parameters */
+ ra = nfsd_get_raparms(inode->i_sb->s_dev, inode->i_ino);
+
+ if (ra && ra->p_set)
+ file->f_ra = ra->p_ra;
+
+ err = nfsd_vfs_read(rqstp, fhp, file, offset, vec, vlen, count);
+
+ /* Write back readahead params */
+ if (ra) {
+ struct raparm_hbucket *rab = &raparm_hash[ra->p_hindex];
+ spin_lock(&rab->pb_lock);
+ ra->p_ra = file->f_ra;
+ ra->p_set = 1;
+ ra->p_count--;
+ spin_unlock(&rab->pb_lock);
+ }
+
+ nfsd_close(file);
+ return err;
+}
+
+/* As above, but use the provided file descriptor. */
__be32
-nfsd_read(struct svc_rqst *rqstp, struct svc_fh *fhp, struct file *file,
+nfsd_read_file(struct svc_rqst *rqstp, struct svc_fh *fhp, struct file *file,
loff_t offset, struct kvec *vec, int vlen,
unsigned long *count)
{
if (err)
goto out;
err = nfsd_vfs_read(rqstp, fhp, file, offset, vec, vlen, count);
- } else {
- err = nfsd_open(rqstp, fhp, S_IFREG, NFSD_MAY_READ, &file);
- if (err)
- goto out;
- err = nfsd_vfs_read(rqstp, fhp, file, offset, vec, vlen, count);
- nfsd_close(file);
- }
+ } else /* Note file may still be NULL in NFSv4 special stateid case: */
+ err = nfsd_read(rqstp, fhp, offset, vec, vlen, count);
out:
return err;
}
char *name, int len, struct svc_fh *tfhp)
{
struct dentry *ddir, *dnew, *dold;
- struct inode *dirp, *dest;
+ struct inode *dirp;
__be32 err;
int host_err;
goto out_nfserr;
dold = tfhp->fh_dentry;
- dest = dold->d_inode;
host_err = mnt_want_write(tfhp->fh_export->ex_path.mnt);
if (host_err) {
__be32
nfsd_statfs(struct svc_rqst *rqstp, struct svc_fh *fhp, struct kstatfs *stat, int access)
{
- __be32 err = fh_verify(rqstp, fhp, 0, NFSD_MAY_NOP | access);
- if (!err && vfs_statfs(fhp->fh_dentry,stat))
+ struct path path = {
+ .mnt = fhp->fh_export->ex_path.mnt,
+ .dentry = fhp->fh_dentry,
+ };
+ __be32 err;
+
+ err = fh_verify(rqstp, fhp, 0, NFSD_MAY_NOP | access);
+ if (!err && vfs_statfs(&path, stat))
err = nfserr_io;
return err;
}
struct dentry *dentry, int acc)
{
struct inode *inode = dentry->d_inode;
- struct path path;
int err;
if (acc == NFSD_MAY_NOP)
if (err == -EACCES && S_ISREG(inode->i_mode) &&
acc == (NFSD_MAY_READ | NFSD_MAY_OWNER_OVERRIDE))
err = inode_permission(inode, MAY_EXEC);
- if (err)
- goto nfsd_out;
- /* Do integrity (permission) checking now, but defer incrementing
- * IMA counts to the actual file open.
- */
- path.mnt = exp->ex_path.mnt;
- path.dentry = dentry;
-nfsd_out:
return err? nfserrno(err) : 0;
}
* the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA.
*/
- /*
- * fsnotify inode mark locking/lifetime/and refcnting
- *
- * REFCNT:
- * The mark->refcnt tells how many "things" in the kernel currently are
- * referencing this object. The object typically will live inside the kernel
- * with a refcnt of 2, one for each list it is on (i_list, g_list). Any task
- * which can find this object holding the appropriete locks, can take a reference
- * and the object itself is guarenteed to survive until the reference is dropped.
- *
- * LOCKING:
- * There are 3 spinlocks involved with fsnotify inode marks and they MUST
- * be taken in order as follows:
- *
- * entry->lock
- * group->mark_lock
- * inode->i_lock
- *
- * entry->lock protects 2 things, entry->group and entry->inode. You must hold
- * that lock to dereference either of these things (they could be NULL even with
- * the lock)
- *
- * group->mark_lock protects the mark_entries list anchored inside a given group
- * and each entry is hooked via the g_list. It also sorta protects the
- * free_g_list, which when used is anchored by a private list on the stack of the
- * task which held the group->mark_lock.
- *
- * inode->i_lock protects the i_fsnotify_mark_entries list anchored inside a
- * given inode and each entry is hooked via the i_list. (and sorta the
- * free_i_list)
- *
- *
- * LIFETIME:
- * Inode marks survive between when they are added to an inode and when their
- * refcnt==0.
- *
- * The inode mark can be cleared for a number of different reasons including:
- * - The inode is unlinked for the last time. (fsnotify_inode_remove)
- * - The inode is being evicted from cache. (fsnotify_inode_delete)
- * - The fs the inode is on is unmounted. (fsnotify_inode_delete/fsnotify_unmount_inodes)
- * - Something explicitly requests that it be removed. (fsnotify_destroy_mark_by_entry)
- * - The fsnotify_group associated with the mark is going away and all such marks
- * need to be cleaned up. (fsnotify_clear_marks_by_group)
- *
- * Worst case we are given an inode and need to clean up all the marks on that
- * inode. We take i_lock and walk the i_fsnotify_mark_entries safely. For each
- * mark on the list we take a reference (so the mark can't disappear under us).
- * We remove that mark form the inode's list of marks and we add this mark to a
- * private list anchored on the stack using i_free_list; At this point we no
- * longer fear anything finding the mark using the inode's list of marks.
- *
- * We can safely and locklessly run the private list on the stack of everything
- * we just unattached from the original inode. For each mark on the private list
- * we grab the mark-> and can thus dereference mark->group and mark->inode. If
- * we see the group and inode are not NULL we take those locks. Now holding all
- * 3 locks we can completely remove the mark from other tasks finding it in the
- * future. Remember, 10 things might already be referencing this mark, but they
- * better be holding a ref. We drop our reference we took before we unhooked it
- * from the inode. When the ref hits 0 we can free the mark.
- *
- * Very similarly for freeing by group, except we use free_g_list.
- *
- * This has the very interesting property of being able to run concurrently with
- * any (or all) other directions.
- */
-
#include <linux/fs.h>
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/fsnotify_backend.h>
#include "fsnotify.h"
- void fsnotify_get_mark(struct fsnotify_mark_entry *entry)
- {
- atomic_inc(&entry->refcnt);
- }
-
- void fsnotify_put_mark(struct fsnotify_mark_entry *entry)
- {
- if (atomic_dec_and_test(&entry->refcnt))
- entry->free_mark(entry);
- }
-
/*
* Recalculate the mask of events relevant to a given inode locked.
*/
static void fsnotify_recalc_inode_mask_locked(struct inode *inode)
{
- struct fsnotify_mark_entry *entry;
+ struct fsnotify_mark *mark;
struct hlist_node *pos;
__u32 new_mask = 0;
assert_spin_locked(&inode->i_lock);
- hlist_for_each_entry(entry, pos, &inode->i_fsnotify_mark_entries, i_list)
- new_mask |= entry->mask;
+ hlist_for_each_entry(mark, pos, &inode->i_fsnotify_marks, i.i_list)
+ new_mask |= mark->mask;
inode->i_fsnotify_mask = new_mask;
}
__fsnotify_update_child_dentry_flags(inode);
}
- /*
- * Any time a mark is getting freed we end up here.
- * The caller had better be holding a reference to this mark so we don't actually
- * do the final put under the entry->lock
- */
- void fsnotify_destroy_mark_by_entry(struct fsnotify_mark_entry *entry)
+ void fsnotify_destroy_inode_mark(struct fsnotify_mark *mark)
{
- struct fsnotify_group *group;
- struct inode *inode;
-
- spin_lock(&entry->lock);
-
- group = entry->group;
- inode = entry->inode;
-
- BUG_ON(group && !inode);
- BUG_ON(!group && inode);
-
- /* if !group something else already marked this to die */
- if (!group) {
- spin_unlock(&entry->lock);
- return;
- }
+ struct inode *inode = mark->i.inode;
- /* 1 from caller and 1 for being on i_list/g_list */
- BUG_ON(atomic_read(&entry->refcnt) < 2);
+ assert_spin_locked(&mark->lock);
+ assert_spin_locked(&mark->group->mark_lock);
- spin_lock(&group->mark_lock);
spin_lock(&inode->i_lock);
- hlist_del_init(&entry->i_list);
- entry->inode = NULL;
-
- list_del_init(&entry->g_list);
- entry->group = NULL;
-
- fsnotify_put_mark(entry); /* for i_list and g_list */
+ hlist_del_init_rcu(&mark->i.i_list);
+ mark->i.inode = NULL;
/*
- * this mark is now off the inode->i_fsnotify_mark_entries list and we
+ * this mark is now off the inode->i_fsnotify_marks list and we
* hold the inode->i_lock, so this is the perfect time to update the
* inode->i_fsnotify_mask
*/
fsnotify_recalc_inode_mask_locked(inode);
spin_unlock(&inode->i_lock);
- spin_unlock(&group->mark_lock);
- spin_unlock(&entry->lock);
-
- /*
- * Some groups like to know that marks are being freed. This is a
- * callback to the group function to let it know that this entry
- * is being freed.
- */
- if (group->ops->freeing_mark)
- group->ops->freeing_mark(entry, group);
-
- /*
- * __fsnotify_update_child_dentry_flags(inode);
- *
- * I really want to call that, but we can't, we have no idea if the inode
- * still exists the second we drop the entry->lock.
- *
- * The next time an event arrive to this inode from one of it's children
- * __fsnotify_parent will see that the inode doesn't care about it's
- * children and will update all of these flags then. So really this
- * is just a lazy update (and could be a perf win...)
- */
-
-
- iput(inode);
-
- /*
- * it's possible that this group tried to destroy itself, but this
- * this mark was simultaneously being freed by inode. If that's the
- * case, we finish freeing the group here.
- */
- if (unlikely(atomic_dec_and_test(&group->num_marks)))
- fsnotify_final_destroy_group(group);
- }
-
- /*
- * Given a group, destroy all of the marks associated with that group.
- */
- void fsnotify_clear_marks_by_group(struct fsnotify_group *group)
- {
- struct fsnotify_mark_entry *lentry, *entry;
- LIST_HEAD(free_list);
-
- spin_lock(&group->mark_lock);
- list_for_each_entry_safe(entry, lentry, &group->mark_entries, g_list) {
- list_add(&entry->free_g_list, &free_list);
- list_del_init(&entry->g_list);
- fsnotify_get_mark(entry);
- }
- spin_unlock(&group->mark_lock);
-
- list_for_each_entry_safe(entry, lentry, &free_list, free_g_list) {
- fsnotify_destroy_mark_by_entry(entry);
- fsnotify_put_mark(entry);
- }
}
/*
*/
void fsnotify_clear_marks_by_inode(struct inode *inode)
{
- struct fsnotify_mark_entry *entry, *lentry;
+ struct fsnotify_mark *mark, *lmark;
struct hlist_node *pos, *n;
LIST_HEAD(free_list);
spin_lock(&inode->i_lock);
- hlist_for_each_entry_safe(entry, pos, n, &inode->i_fsnotify_mark_entries, i_list) {
- list_add(&entry->free_i_list, &free_list);
- hlist_del_init(&entry->i_list);
- fsnotify_get_mark(entry);
+ hlist_for_each_entry_safe(mark, pos, n, &inode->i_fsnotify_marks, i.i_list) {
+ list_add(&mark->i.free_i_list, &free_list);
+ hlist_del_init_rcu(&mark->i.i_list);
+ fsnotify_get_mark(mark);
}
spin_unlock(&inode->i_lock);
- list_for_each_entry_safe(entry, lentry, &free_list, free_i_list) {
- fsnotify_destroy_mark_by_entry(entry);
- fsnotify_put_mark(entry);
+ list_for_each_entry_safe(mark, lmark, &free_list, i.free_i_list) {
+ fsnotify_destroy_mark(mark);
+ fsnotify_put_mark(mark);
}
}
+ /*
+ * Given a group clear all of the inode marks associated with that group.
+ */
+ void fsnotify_clear_inode_marks_by_group(struct fsnotify_group *group)
+ {
+ fsnotify_clear_marks_by_group_flags(group, FSNOTIFY_MARK_FLAG_INODE);
+ }
+
/*
* given a group and inode, find the mark associated with that combination.
* if found take a reference to that mark and return it, else return NULL
*/
- struct fsnotify_mark_entry *fsnotify_find_mark_entry(struct fsnotify_group *group,
- struct inode *inode)
+ struct fsnotify_mark *fsnotify_find_inode_mark_locked(struct fsnotify_group *group,
+ struct inode *inode)
{
- struct fsnotify_mark_entry *entry;
+ struct fsnotify_mark *mark;
struct hlist_node *pos;
assert_spin_locked(&inode->i_lock);
- hlist_for_each_entry(entry, pos, &inode->i_fsnotify_mark_entries, i_list) {
- if (entry->group == group) {
- fsnotify_get_mark(entry);
- return entry;
+ hlist_for_each_entry(mark, pos, &inode->i_fsnotify_marks, i.i_list) {
+ if (mark->group == group) {
+ fsnotify_get_mark(mark);
+ return mark;
}
}
return NULL;
}
/*
- * Nothing fancy, just initialize lists and locks and counters.
+ * given a group and inode, find the mark associated with that combination.
+ * if found take a reference to that mark and return it, else return NULL
*/
- void fsnotify_init_mark(struct fsnotify_mark_entry *entry,
- void (*free_mark)(struct fsnotify_mark_entry *entry))
+ struct fsnotify_mark *fsnotify_find_inode_mark(struct fsnotify_group *group,
+ struct inode *inode)
+ {
+ struct fsnotify_mark *mark;
+
+ spin_lock(&inode->i_lock);
+ mark = fsnotify_find_inode_mark_locked(group, inode);
+ spin_unlock(&inode->i_lock);
+
+ return mark;
+ }
+ /*
+ * If we are setting a mark mask on an inode mark we should pin the inode
+ * in memory.
+ */
+ void fsnotify_set_inode_mark_mask_locked(struct fsnotify_mark *mark,
+ __u32 mask)
{
- spin_lock_init(&entry->lock);
- atomic_set(&entry->refcnt, 1);
- INIT_HLIST_NODE(&entry->i_list);
- entry->group = NULL;
- entry->mask = 0;
- entry->inode = NULL;
- entry->free_mark = free_mark;
+ struct inode *inode;
+
+ assert_spin_locked(&mark->lock);
+
+ if (mask &&
+ mark->i.inode &&
+ !(mark->flags & FSNOTIFY_MARK_FLAG_OBJECT_PINNED)) {
+ mark->flags |= FSNOTIFY_MARK_FLAG_OBJECT_PINNED;
+ inode = igrab(mark->i.inode);
+ /*
+ * we shouldn't be able to get here if the inode wasn't
+ * already safely held in memory. But bug in case it
+ * ever is wrong.
+ */
+ BUG_ON(!inode);
+ }
}
/*
- * Attach an initialized mark entry to a given group and inode.
+ * Attach an initialized mark to a given inode.
* These marks may be used for the fsnotify backend to determine which
- * event types should be delivered to which group and for which inodes.
+ * event types should be delivered to which group and for which inodes. These
+ * marks are ordered according to the group's location in memory.
*/
- int fsnotify_add_mark(struct fsnotify_mark_entry *entry,
- struct fsnotify_group *group, struct inode *inode)
+ int fsnotify_add_inode_mark(struct fsnotify_mark *mark,
+ struct fsnotify_group *group, struct inode *inode,
+ int allow_dups)
{
- struct fsnotify_mark_entry *lentry;
+ struct fsnotify_mark *lmark;
+ struct hlist_node *node, *last = NULL;
int ret = 0;
- inode = igrab(inode);
- if (unlikely(!inode))
- return -EINVAL;
+ mark->flags |= FSNOTIFY_MARK_FLAG_INODE;
+
+ assert_spin_locked(&mark->lock);
+ assert_spin_locked(&group->mark_lock);
- /*
- * LOCKING ORDER!!!!
- * entry->lock
- * group->mark_lock
- * inode->i_lock
- */
- spin_lock(&entry->lock);
- spin_lock(&group->mark_lock);
spin_lock(&inode->i_lock);
- lentry = fsnotify_find_mark_entry(group, inode);
- if (!lentry) {
- entry->group = group;
- entry->inode = inode;
+ mark->i.inode = inode;
- hlist_add_head(&entry->i_list, &inode->i_fsnotify_mark_entries);
- list_add(&entry->g_list, &group->mark_entries);
+ /* is mark the first mark? */
+ if (hlist_empty(&inode->i_fsnotify_marks)) {
+ hlist_add_head_rcu(&mark->i.i_list, &inode->i_fsnotify_marks);
+ goto out;
+ }
- fsnotify_get_mark(entry); /* for i_list and g_list */
+ /* should mark be in the middle of the current list? */
+ hlist_for_each_entry(lmark, node, &inode->i_fsnotify_marks, i.i_list) {
+ last = node;
- atomic_inc(&group->num_marks);
+ if ((lmark->group == group) && !allow_dups) {
+ ret = -EEXIST;
+ goto out;
+ }
+
+ if (mark->group < lmark->group)
+ continue;
- fsnotify_recalc_inode_mask_locked(inode);
+ hlist_add_before_rcu(&mark->i.i_list, &lmark->i.i_list);
+ goto out;
}
+ BUG_ON(last == NULL);
+ /* mark should be the last entry. last is the current last entry */
+ hlist_add_after_rcu(last, &mark->i.i_list);
+ out:
+ fsnotify_recalc_inode_mask_locked(inode);
spin_unlock(&inode->i_lock);
- spin_unlock(&group->mark_lock);
- spin_unlock(&entry->lock);
-
- if (lentry) {
- ret = -EEXIST;
- iput(inode);
- fsnotify_put_mark(lentry);
- } else {
- __fsnotify_update_child_dentry_flags(inode);
- }
return ret;
}
struct inode *need_iput_tmp;
/*
- * We cannot __iget() an inode in state I_CLEAR, I_FREEING,
+ * We cannot __iget() an inode in state I_FREEING,
* I_WILL_FREE, or I_NEW which is fine because by that point
* the inode cannot have any associated watches.
*/
- if (inode->i_state & (I_CLEAR|I_FREEING|I_WILL_FREE|I_NEW))
+ if (inode->i_state & (I_FREEING|I_WILL_FREE|I_NEW))
continue;
/*
/* In case the dropping of a reference would nuke next_i. */
if ((&next_i->i_sb_list != list) &&
atomic_read(&next_i->i_count) &&
- !(next_i->i_state & (I_CLEAR | I_FREEING | I_WILL_FREE))) {
+ !(next_i->i_state & (I_FREEING | I_WILL_FREE))) {
__iget(next_i);
need_iput = next_i;
}
#include <linux/falloc.h>
#include <linux/fs_struct.h>
#include <linux/ima.h>
+ #include <linux/dnotify.h>
#include "internal.h"
error = locks_verify_truncate(inode, NULL, length);
if (!error)
- error = security_path_truncate(&path, length, 0);
+ error = security_path_truncate(&path);
if (!error)
error = do_truncate(path.dentry, length, 0, NULL);
error = locks_verify_truncate(inode, file, length);
if (!error)
- error = security_path_truncate(&file->f_path, length,
- ATTR_MTIME|ATTR_CTIME);
+ error = security_path_truncate(&file->f_path);
if (!error)
error = do_truncate(dentry, length, ATTR_MTIME|ATTR_CTIME, file);
out_putf:
if (error)
goto out;
- error = inode_permission(path.dentry->d_inode, MAY_EXEC | MAY_ACCESS);
+ error = inode_permission(path.dentry->d_inode, MAY_EXEC | MAY_CHDIR);
if (error)
goto dput_and_out;
if (!S_ISDIR(inode->i_mode))
goto out_putf;
- error = inode_permission(inode, MAY_EXEC | MAY_ACCESS);
+ error = inode_permission(inode, MAY_EXEC | MAY_CHDIR);
if (!error)
set_fs_pwd(current->fs, &file->f_path);
out_putf:
if (error)
goto out;
- error = inode_permission(path.dentry->d_inode, MAY_EXEC | MAY_ACCESS);
+ error = inode_permission(path.dentry->d_inode, MAY_EXEC | MAY_CHDIR);
if (error)
goto dput_and_out;
put_unused_fd(fd);
fd = PTR_ERR(f);
} else {
- fsnotify_open(f->f_path.dentry);
+ fsnotify_open(f);
fd_install(fd, f);
}
}
unifdef-y += eventpoll.h
unifdef-y += signalfd.h
unifdef-y += ext2_fs.h
+ unifdef-y += fanotify.h
unifdef-y += fb.h
unifdef-y += fcntl.h
unifdef-y += filter.h
$(srctree)/include/asm-$(SRCARCH)/kvm_para.h),)
unifdef-y += kvm_para.h
endif
+unifdef-y += l2tp.h
unifdef-y += llc.h
unifdef-y += loop.h
unifdef-y += lp.h
#define MAY_APPEND 8
#define MAY_ACCESS 16
#define MAY_OPEN 32
+#define MAY_CHDIR 64
/*
* flags in file.f_mode. Note that FMODE_READ and FMODE_WRITE must correspond
/* Expect random access pattern */
#define FMODE_RANDOM ((__force fmode_t)0x1000)
+ /* File was opened by fanotify and shouldn't generate fanotify events */
+ #define FMODE_NONOTIFY ((__force fmode_t)16777216) /* 0x1000000 */
+
/*
* The below are the various read and write types that we support. Some of
* them include behavioral modifiers that send information down to the
#define MS_KERNMOUNT (1<<22) /* this is a kern_mount call */
#define MS_I_VERSION (1<<23) /* Update inode I_version field */
#define MS_STRICTATIME (1<<24) /* Always perform atime updates */
+#define MS_BORN (1<<29)
#define MS_ACTIVE (1<<30)
#define MS_NOUSER (1<<31)
extern int sysctl_nr_open;
extern struct inodes_stat_t inodes_stat;
extern int leases_enable, lease_break_time;
- #ifdef CONFIG_DNOTIFY
- extern int dir_notify_enable;
- #endif
struct buffer_head;
typedef int (get_block_t)(struct inode *inode, sector_t iblock,
struct buffer_head *bh_result, int create);
typedef void (dio_iodone_t)(struct kiocb *iocb, loff_t offset,
- ssize_t bytes, void *private);
+ ssize_t bytes, void *private, int ret,
+ bool is_async);
/*
* Attribute flags. These should be or-ed together to figure out what
*/
#define PAGECACHE_TAG_DIRTY 0
#define PAGECACHE_TAG_WRITEBACK 1
+#define PAGECACHE_TAG_TOWRITE 2
int mapping_tagged(struct address_space *mapping, int tag);
#ifdef CONFIG_FSNOTIFY
__u32 i_fsnotify_mask; /* all events this inode cares about */
- struct hlist_head i_fsnotify_mark_entries; /* fsnotify mark entries */
- #endif
-
- #ifdef CONFIG_INOTIFY
- struct list_head inotify_watches; /* watches on this inode */
- struct mutex inotify_mutex; /* protects the watches list */
+ struct hlist_head i_fsnotify_marks;
#endif
unsigned long i_state;
void (*dirty_inode) (struct inode *);
int (*write_inode) (struct inode *, struct writeback_control *wbc);
- void (*drop_inode) (struct inode *);
- void (*delete_inode) (struct inode *);
+ int (*drop_inode) (struct inode *);
+ void (*evict_inode) (struct inode *);
void (*put_super) (struct super_block *);
void (*write_super) (struct super_block *);
int (*sync_fs)(struct super_block *sb, int wait);
int (*unfreeze_fs) (struct super_block *);
int (*statfs) (struct dentry *, struct kstatfs *);
int (*remount_fs) (struct super_block *, int *, char *);
- void (*clear_inode) (struct inode *);
void (*umount_begin) (struct super_block *);
int (*show_options)(struct seq_file *, struct vfsmount *);
* I_FREEING Set when inode is about to be freed but still has dirty
* pages or buffers attached or the inode itself is still
* dirty.
- * I_CLEAR Set by clear_inode(). In this state the inode is clean
- * and can be destroyed.
+ * I_CLEAR Added by end_writeback(). In this state the inode is clean
+ * and can be destroyed. Inode keeps I_FREEING.
*
* Inodes that are I_WILL_FREE, I_FREEING or I_CLEAR are
* prohibited for many purposes. iget() must wait for
extern void drop_collected_mounts(struct vfsmount *);
extern int iterate_mounts(int (*)(struct vfsmount *, void *), void *,
struct vfsmount *);
-extern int vfs_statfs(struct dentry *, struct kstatfs *);
+extern int vfs_statfs(struct path *, struct kstatfs *);
+extern int statfs_by_dentry(struct dentry *, struct kstatfs *);
extern int freeze_super(struct super_block *super);
extern int thaw_super(struct super_block *super);
extern struct inode * igrab(struct inode *);
extern ino_t iunique(struct super_block *, ino_t);
extern int inode_needs_sync(struct inode *inode);
-extern void generic_delete_inode(struct inode *inode);
-extern void generic_drop_inode(struct inode *inode);
-extern int generic_detach_inode(struct inode *inode);
+extern int generic_delete_inode(struct inode *inode);
+extern int generic_drop_inode(struct inode *inode);
extern struct inode *ilookup5_nowait(struct super_block *sb,
unsigned long hashval, int (*test)(struct inode *, void *),
extern void __iget(struct inode * inode);
extern void iget_failed(struct inode *);
-extern void clear_inode(struct inode *);
+extern void end_writeback(struct inode *);
extern void destroy_inode(struct inode *);
extern void __destroy_inode(struct inode *);
extern struct inode *new_inode(struct super_block *);
struct bio;
typedef void (dio_submit_t)(int rw, struct bio *bio, struct inode *inode,
loff_t file_offset);
-void dio_end_io(struct bio *bio, int error);
-
-ssize_t __blockdev_direct_IO_newtrunc(int rw, struct kiocb *iocb, struct inode *inode,
- struct block_device *bdev, const struct iovec *iov, loff_t offset,
- unsigned long nr_segs, get_block_t get_block, dio_iodone_t end_io,
- dio_submit_t submit_io, int lock_type);
-ssize_t __blockdev_direct_IO(int rw, struct kiocb *iocb, struct inode *inode,
- struct block_device *bdev, const struct iovec *iov, loff_t offset,
- unsigned long nr_segs, get_block_t get_block, dio_iodone_t end_io,
- dio_submit_t submit_io, int lock_type);
enum {
/* need locking between buffered and direct access */
DIO_SKIP_HOLES = 0x02,
};
-static inline ssize_t blockdev_direct_IO_newtrunc(int rw, struct kiocb *iocb,
- struct inode *inode, struct block_device *bdev, const struct iovec *iov,
- loff_t offset, unsigned long nr_segs, get_block_t get_block,
- dio_iodone_t end_io)
-{
- return __blockdev_direct_IO_newtrunc(rw, iocb, inode, bdev, iov, offset,
- nr_segs, get_block, end_io, NULL,
- DIO_LOCKING | DIO_SKIP_HOLES);
-}
+void dio_end_io(struct bio *bio, int error);
+
+ssize_t __blockdev_direct_IO(int rw, struct kiocb *iocb, struct inode *inode,
+ struct block_device *bdev, const struct iovec *iov, loff_t offset,
+ unsigned long nr_segs, get_block_t get_block, dio_iodone_t end_io,
+ dio_submit_t submit_io, int flags);
-static inline ssize_t blockdev_direct_IO_no_locking_newtrunc(int rw, struct kiocb *iocb,
- struct inode *inode, struct block_device *bdev, const struct iovec *iov,
- loff_t offset, unsigned long nr_segs, get_block_t get_block,
- dio_iodone_t end_io)
-{
- return __blockdev_direct_IO_newtrunc(rw, iocb, inode, bdev, iov, offset,
- nr_segs, get_block, end_io, NULL, 0);
-}
static inline ssize_t blockdev_direct_IO(int rw, struct kiocb *iocb,
struct inode *inode, struct block_device *bdev, const struct iovec *iov,
loff_t offset, unsigned long nr_segs, get_block_t get_block,
nr_segs, get_block, end_io, NULL,
DIO_LOCKING | DIO_SKIP_HOLES);
}
-
-static inline ssize_t blockdev_direct_IO_no_locking(int rw, struct kiocb *iocb,
- struct inode *inode, struct block_device *bdev, const struct iovec *iov,
- loff_t offset, unsigned long nr_segs, get_block_t get_block,
- dio_iodone_t end_io)
-{
- return __blockdev_direct_IO(rw, iocb, inode, bdev, iov, offset,
- nr_segs, get_block, end_io, NULL, 0);
-}
#endif
extern const struct file_operations generic_ro_fops;
extern int simple_unlink(struct inode *, struct dentry *);
extern int simple_rmdir(struct inode *, struct dentry *);
extern int simple_rename(struct inode *, struct dentry *, struct inode *, struct dentry *);
-extern int simple_setsize(struct inode *, loff_t);
extern int noop_fsync(struct file *, int);
extern int simple_empty(struct dentry *);
extern int simple_readpage(struct file *file, struct page *page);
extern int inode_change_ok(const struct inode *, struct iattr *);
extern int inode_newsize_ok(const struct inode *, loff_t offset);
-extern int __must_check inode_setattr(struct inode *, const struct iattr *);
-extern void generic_setattr(struct inode *inode, const struct iattr *attr);
+extern void setattr_copy(struct inode *inode, const struct iattr *attr);
extern void file_update_time(struct file *file);
int __init get_filesystem_list(char *buf);
#define ACC_MODE(x) ("\004\002\006\006"[(x)&O_ACCMODE])
- #define OPEN_FMODE(flag) ((__force fmode_t)((flag + 1) & O_ACCMODE))
+ #define OPEN_FMODE(flag) ((__force fmode_t)(((flag + 1) & O_ACCMODE) | \
+ (flag & FMODE_NONOTIFY)))
#endif /* __KERNEL__ */
#endif /* _LINUX_FS_H */
#define __LINUX_SECURITY_H
#include <linux/fs.h>
+ #include <linux/fsnotify.h>
#include <linux/binfmts.h>
#include <linux/signal.h>
#include <linux/resource.h>
* @path_truncate:
* Check permission before truncating a file.
* @path contains the path structure for the file.
- * @length is the new length of the file.
- * @time_attrs is the flags passed to do_truncate().
* Return 0 if permission is granted.
* @inode_getattr:
* Check permission before obtaining file attributes.
int (*path_rmdir) (struct path *dir, struct dentry *dentry);
int (*path_mknod) (struct path *dir, struct dentry *dentry, int mode,
unsigned int dev);
- int (*path_truncate) (struct path *path, loff_t length,
- unsigned int time_attrs);
+ int (*path_truncate) (struct path *path);
int (*path_symlink) (struct path *dir, struct dentry *dentry,
const char *old_name);
int (*path_link) (struct dentry *old_dentry, struct path *new_dir,
int security_path_rmdir(struct path *dir, struct dentry *dentry);
int security_path_mknod(struct path *dir, struct dentry *dentry, int mode,
unsigned int dev);
-int security_path_truncate(struct path *path, loff_t length,
- unsigned int time_attrs);
+int security_path_truncate(struct path *path);
int security_path_symlink(struct path *dir, struct dentry *dentry,
const char *old_name);
int security_path_link(struct dentry *old_dentry, struct path *new_dir,
return 0;
}
-static inline int security_path_truncate(struct path *path, loff_t length,
- unsigned int time_attrs)
+static inline int security_path_truncate(struct path *path)
{
return 0;
}
.enter_event = &event_enter_##sname, \
.exit_event = &event_exit_##sname, \
.enter_fields = LIST_HEAD_INIT(__syscall_meta_##sname.enter_fields), \
- .exit_fields = LIST_HEAD_INIT(__syscall_meta_##sname.exit_fields), \
};
#define SYSCALL_DEFINE0(sname) \
.enter_event = &event_enter__##sname, \
.exit_event = &event_exit__##sname, \
.enter_fields = LIST_HEAD_INIT(__syscall_meta__##sname.enter_fields), \
- .exit_fields = LIST_HEAD_INIT(__syscall_meta__##sname.exit_fields), \
}; \
asmlinkage long sys_##sname(void)
#else
asmlinkage long sys_ppoll(struct pollfd __user *, unsigned int,
struct timespec __user *, const sigset_t __user *,
size_t);
+ asmlinkage long sys_fanotify_init(unsigned int flags, unsigned int event_f_flags);
+ asmlinkage long sys_fanotify_mark(int fanotify_fd, unsigned int flags,
+ u64 mask, int fd,
+ const char __user *pathname);
int kernel_execve(const char *filename, char *const argv[], char *const envp[]);
help
Enable low-overhead system-call auditing infrastructure that
can be used independently or with another kernel subsystem,
- such as SELinux. To use audit's filesystem watch feature, please
- ensure that INOTIFY is configured.
+ such as SELinux.
+
+ config AUDIT_WATCH
+ def_bool y
+ depends on AUDITSYSCALL
+ select FSNOTIFY
config AUDIT_TREE
def_bool y
depends on AUDITSYSCALL
- select INOTIFY
+ select FSNOTIFY
menu "RCU Subsystem"
source "arch/Kconfig"
-config SLOW_WORK
- default n
- bool
- help
- The slow work thread pool provides a number of dynamically allocated
- threads that can be used by the kernel to perform operations that
- take a relatively long time.
-
- An example of this would be CacheFiles doing a path lookup followed
- by a series of mkdirs and a create call, all of which have to touch
- disk.
-
- See Documentation/slow-work.txt.
-
-config SLOW_WORK_DEBUG
- bool "Slow work debugging through debugfs"
- default n
- depends on SLOW_WORK && DEBUG_FS
- help
- Display the contents of the slow work run queue through debugfs,
- including items currently executing.
-
- See Documentation/slow-work.txt.
-
endmenu # General setup
config HAVE_GENERIC_DMA_COHERENT
obj-$(CONFIG_RESOURCE_COUNTERS) += res_counter.o
obj-$(CONFIG_SMP) += stop_machine.o
obj-$(CONFIG_KPROBES_SANITY_TEST) += test_kprobes.o
- obj-$(CONFIG_AUDIT) += audit.o auditfilter.o audit_watch.o
+ obj-$(CONFIG_AUDIT) += audit.o auditfilter.o
obj-$(CONFIG_AUDITSYSCALL) += auditsc.o
- obj-$(CONFIG_GCOV_KERNEL) += gcov/
+ obj-$(CONFIG_AUDIT_WATCH) += audit_watch.o
obj-$(CONFIG_AUDIT_TREE) += audit_tree.o
+ obj-$(CONFIG_GCOV_KERNEL) += gcov/
obj-$(CONFIG_KPROBES) += kprobes.o
obj-$(CONFIG_KGDB) += debug/
-obj-$(CONFIG_DETECT_SOFTLOCKUP) += softlockup.o
obj-$(CONFIG_DETECT_HUNG_TASK) += hung_task.o
+obj-$(CONFIG_LOCKUP_DETECTOR) += watchdog.o
obj-$(CONFIG_GENERIC_HARDIRQS) += irq/
obj-$(CONFIG_SECCOMP) += seccomp.o
obj-$(CONFIG_RCU_TORTURE_TEST) += rcutorture.o
obj-$(CONFIG_X86_DS) += trace/
obj-$(CONFIG_RING_BUFFER) += trace/
obj-$(CONFIG_SMP) += sched_cpupri.o
-obj-$(CONFIG_SLOW_WORK) += slow-work.o
-obj-$(CONFIG_SLOW_WORK_DEBUG) += slow-work-debugfs.o
obj-$(CONFIG_PERF_EVENTS) += perf_event.o
obj-$(CONFIG_HAVE_HW_BREAKPOINT) += hw_breakpoint.o
obj-$(CONFIG_USER_RETURN_NOTIFIER) += user-return-notifier.o
#include <net/netlink.h>
#include <linux/skbuff.h>
#include <linux/netlink.h>
- #include <linux/inotify.h>
#include <linux/freezer.h>
#include <linux/tty.h>
audit_hold_skb(skb);
} else
/* drop the extra reference if sent ok */
- kfree_skb(skb);
+ consume_skb(skb);
}
static int kauditd_thread(void *dummy)
#include <linux/times.h>
#include <linux/limits.h>
#include <linux/dcache.h>
+ #include <linux/dnotify.h>
#include <linux/syscalls.h>
#include <linux/vmstat.h>
#include <linux/nfs_fs.h>
#include <linux/acpi.h>
#include <linux/reboot.h>
#include <linux/ftrace.h>
-#include <linux/slow-work.h>
#include <linux/perf_event.h>
#include <linux/kprobes.h>
#include <linux/pipe_fs_i.h>
+#include <linux/oom.h>
#include <asm/uaccess.h>
#include <asm/processor.h>
#include <scsi/sg.h>
#endif
+#ifdef CONFIG_LOCKUP_DETECTOR
+#include <linux/nmi.h>
+#endif
+
#if defined(CONFIG_SYSCTL)
/* External variables not in a header file. */
extern int sysctl_overcommit_memory;
extern int sysctl_overcommit_ratio;
-extern int sysctl_panic_on_oom;
-extern int sysctl_oom_kill_allocating_task;
-extern int sysctl_oom_dump_tasks;
extern int max_threads;
extern int core_uses_pid;
extern int suid_dumpable;
#endif
/* Constants used for minimum and maximum */
-#ifdef CONFIG_DETECT_SOFTLOCKUP
+#ifdef CONFIG_LOCKUP_DETECTOR
static int sixty = 60;
static int neg_one = -1;
#endif
static int ngroups_max = NGROUPS_MAX;
+ #ifdef CONFIG_INOTIFY_USER
+ #include <linux/inotify.h>
+ #endif
#ifdef CONFIG_SPARC
#include <asm/system.h>
#endif
static struct ctl_table debug_table[];
static struct ctl_table dev_table[];
extern struct ctl_table random_table[];
- #ifdef CONFIG_INOTIFY_USER
- extern struct ctl_table inotify_table[];
- #endif
#ifdef CONFIG_EPOLL
extern struct ctl_table epoll_table[];
#endif
.extra2 = &one,
},
#endif
-#if defined(CONFIG_HOTPLUG) && defined(CONFIG_NET)
+#ifdef CONFIG_HOTPLUG
{
.procname = "hotplug",
.data = &uevent_helper,
.mode = 0444,
.proc_handler = proc_dointvec,
},
-#if defined(CONFIG_X86_LOCAL_APIC) && defined(CONFIG_X86)
+#if defined(CONFIG_LOCKUP_DETECTOR)
+ {
+ .procname = "watchdog",
+ .data = &watchdog_enabled,
+ .maxlen = sizeof (int),
+ .mode = 0644,
+ .proc_handler = proc_dowatchdog_enabled,
+ },
+ {
+ .procname = "watchdog_thresh",
+ .data = &softlockup_thresh,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = proc_dowatchdog_thresh,
+ .extra1 = &neg_one,
+ .extra2 = &sixty,
+ },
+ {
+ .procname = "softlockup_panic",
+ .data = &softlockup_panic,
+ .maxlen = sizeof(int),
+ .mode = 0644,
+ .proc_handler = proc_dointvec_minmax,
+ .extra1 = &zero,
+ .extra2 = &one,
+ },
+#endif
+#if defined(CONFIG_X86_LOCAL_APIC) && defined(CONFIG_X86) && !defined(CONFIG_LOCKUP_DETECTOR)
{
.procname = "unknown_nmi_panic",
.data = &unknown_nmi_panic,
.proc_handler = proc_dointvec,
},
#endif
-#ifdef CONFIG_DETECT_SOFTLOCKUP
- {
- .procname = "softlockup_panic",
- .data = &softlockup_panic,
- .maxlen = sizeof(int),
- .mode = 0644,
- .proc_handler = proc_dointvec_minmax,
- .extra1 = &zero,
- .extra2 = &one,
- },
- {
- .procname = "softlockup_thresh",
- .data = &softlockup_thresh,
- .maxlen = sizeof(int),
- .mode = 0644,
- .proc_handler = proc_dosoftlockup_thresh,
- .extra1 = &neg_one,
- .extra2 = &sixty,
- },
-#endif
#ifdef CONFIG_DETECT_HUNG_TASK
{
.procname = "hung_task_panic",
.proc_handler = proc_dointvec,
},
#endif
-#ifdef CONFIG_SLOW_WORK
- {
- .procname = "slow-work",
- .mode = 0555,
- .child = slow_work_sysctls,
- },
-#endif
#ifdef CONFIG_PERF_EVENTS
{
.procname = "perf_event_paranoid",
new_dentry);
}
-int security_path_truncate(struct path *path, loff_t length,
- unsigned int time_attrs)
+int security_path_truncate(struct path *path)
{
if (unlikely(IS_PRIVATE(path->dentry->d_inode)))
return 0;
- return security_ops->path_truncate(path, length, time_attrs);
+ return security_ops->path_truncate(path);
}
int security_path_chmod(struct dentry *dentry, struct vfsmount *mnt,
int security_file_permission(struct file *file, int mask)
{
- return security_ops->file_permission(file, mask);
+ int ret;
+
+ ret = security_ops->file_permission(file, mask);
+ if (ret)
+ return ret;
+
+ return fsnotify_perm(file, mask);
}
int security_file_alloc(struct file *file)
int security_dentry_open(struct file *file, const struct cred *cred)
{
- return security_ops->dentry_open(file, cred);
+ int ret;
+
+ ret = security_ops->dentry_open(file, cred);
+ if (ret)
+ return ret;
+
+ return fsnotify_perm(file, MAY_OPEN);
}
int security_task_create(unsigned long clone_flags)