Documentation/CodingStyle

Linux kernel coding style
=========================

This is a short document describing the preferred coding style for the
Linux kernel.  Coding style is very personal, and I won't **force** my
views on anybody, but this is what goes for anything that I have to be
able to maintain, and I'd prefer it for most other things too.  Please
at least consider the points made here.

First off, I'd suggest printing out a copy of the GNU coding standards,
and NOT read it.  Burn them, it's a great symbolic gesture.

Anyway, here goes:


1) Indentation
--------------

Tabs are 8 characters, and thus indentations are also 8 characters.
There are heretic movements that try to make indentations 4 (or even 2!)
characters deep, and that is akin to trying to define the value of PI to
be 3.

Rationale: The whole idea behind indentation is to clearly define where
a block of control starts and ends.  Especially when you've been looking
at your screen for 20 straight hours, you'll find it a lot easier to see
how the indentation works if you have large indentations.

Now, some people will claim that having 8-character indentations makes
the code move too far to the right, and makes it hard to read on an
80-character terminal screen.  The answer to that is that if you need
more than 3 levels of indentation, you're screwed anyway, and should fix
your program.

In short, 8-char indents make things easier to read, and have the added
benefit of warning you when you're nesting your functions too deep.
Heed that warning.

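One common way to act on that warning is to handle the failure cases first
and return early, so the main path stays at a single level of indentation.
A minimal userspace sketch; ``struct request`` and both functions are made
up for illustration:

```c
#include <stddef.h>

/* Hypothetical request type, used only for this illustration. */
struct request {
	int valid;
	int len;
};

/* Nested form: every check pushes the real work one tab to the right. */
static int handle_request_nested(const struct request *req)
{
	if (req) {
		if (req->valid) {
			if (req->len > 0)
				return req->len;
		}
	}
	return -1;
}

/* Same logic with early returns: the main path stays at one level. */
static int handle_request(const struct request *req)
{
	if (!req)
		return -1;
	if (!req->valid)
		return -1;
	if (req->len <= 0)
		return -1;

	return req->len;
}
```
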
The preferred way to ease multiple indentation levels in a switch statement is
to align the ``switch`` and its subordinate ``case`` labels in the same column
instead of ``double-indenting`` the ``case`` labels.  E.g.:

.. code-block:: c

        switch (suffix) {
        case 'G':
        case 'g':
                mem <<= 30;
                break;
        case 'M':
        case 'm':
                mem <<= 20;
                break;
        case 'K':
        case 'k':
                mem <<= 10;
                /* fall through */
        default:
                break;
        }

Don't put multiple statements on a single line unless you have
something to hide:

.. code-block:: c

        if (condition) do_this;
          do_something_everytime;

Don't put multiple assignments on a single line either.  Kernel coding style
is super simple.  Avoid tricky expressions.

Outside of comments, documentation and except in Kconfig, spaces are never
used for indentation, and the above example is deliberately broken.

Get a decent editor and don't leave whitespace at the end of lines.


2) Breaking long lines and strings
----------------------------------

Coding style is all about readability and maintainability using commonly
available tools.

The limit on the length of lines is 80 columns and this is a strongly
preferred limit.

Statements longer than 80 columns will be broken into sensible chunks, unless
exceeding 80 columns significantly increases readability and does not hide
information. Descendants are always substantially shorter than the parent and
are placed substantially to the right. The same applies to function headers
with a long argument list. However, never break user-visible strings such as
printk messages, because that breaks the ability to grep for them.

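As a rough userspace illustration of these rules, the sketch below breaks a
long function header and a long call after a comma, puts the continuations
well to the right, and leaves the message string on one line even though
that line runs past 80 columns.  ``format_status()`` and its message are
invented for the example, and ``snprintf()`` stands in for a kernel printing
function:

```c
#include <stdio.h>

/*
 * Hypothetical helper that formats a status line into buf.  The format
 * string is deliberately not split, so the message can still be found
 * with grep.
 */
static int format_status(char *buf, size_t len, const char *dev,
			 int port, int err)
{
	return snprintf(buf, len,
			"%s: link on port %d went down unexpectedly, giving up after error %d",
			dev, port, err);
}
```
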

3) Placing Braces and Spaces
----------------------------

The other issue that always comes up in C styling is the placement of
braces.  Unlike the indent size, there are few technical reasons to
choose one placement strategy over the other, but the preferred way, as
shown to us by the prophets Kernighan and Ritchie, is to put the opening
brace last on the line, and put the closing brace first, thusly:

.. code-block:: c

        if (x is true) {
                we do y
        }

This applies to all non-function statement blocks (if, switch, for,
while, do).  E.g.:

.. code-block:: c

        switch (action) {
        case KOBJ_ADD:
                return "add";
        case KOBJ_REMOVE:
                return "remove";
        case KOBJ_CHANGE:
                return "change";
        default:
                return NULL;
        }

However, there is one special case, namely functions: they have the
opening brace at the beginning of the next line, thus:

.. code-block:: c

        int function(int x)
        {
                body of function
        }

Heretic people all over the world have claimed that this inconsistency
is ...  well ...  inconsistent, but all right-thinking people know that
(a) K&R are **right** and (b) K&R are right.  Besides, functions are
special anyway (you can't nest them in C).

Note that the closing brace is empty on a line of its own, **except** in
the cases where it is followed by a continuation of the same statement,
i.e. a ``while`` in a do-statement or an ``else`` in an if-statement, like
this:

.. code-block:: c

        do {
                body of do-loop
        } while (condition);

and

.. code-block:: c

        if (x == y) {
                ..
        } else if (x > y) {
                ...
        } else {
                ....
        }

Rationale: K&R.

Also, note that this brace-placement also minimizes the number of empty
(or almost empty) lines, without any loss of readability.  Thus, as the
supply of new-lines on your screen is not a renewable resource (think
25-line terminal screens here), you have more empty lines to put
comments on.

Do not unnecessarily use braces where a single statement will do.

.. code-block:: c

        if (condition)
                action();

and

.. code-block:: c

        if (condition)
                do_this();
        else
                do_that();

This does not apply if only one branch of a conditional statement is a single
statement; in the latter case use braces in both branches:

.. code-block:: c

        if (condition) {
                do_this();
                do_that();
        } else {
                otherwise();
        }

3.1) Spaces
***********

Linux kernel style for use of spaces depends (mostly) on
function-versus-keyword usage.  Use a space after (most) keywords.  The
notable exceptions are sizeof, typeof, alignof, and __attribute__, which look
somewhat like functions (and are usually used with parentheses in Linux,
although they are not required in the language, as in: ``sizeof info`` after
``struct fileinfo info;`` is declared).

So use a space after these keywords::

        if, switch, case, for, do, while

but not with sizeof, typeof, alignof, or __attribute__.  E.g.,

.. code-block:: c

        s = sizeof(struct file);

Do not add spaces around (inside) parenthesized expressions.  This example is
**bad**:

.. code-block:: c

        s = sizeof( struct file );

When declaring pointer data or a function that returns a pointer type, the
preferred use of ``*`` is adjacent to the data name or function name and not
adjacent to the type name.  Examples:

.. code-block:: c

        char *linux_banner;
        unsigned long long memparse(char *ptr, char **retptr);
        char *match_strdup(substring_t *s);

Use one space around (on each side of) most binary and ternary operators,
such as any of these::

        =  +  -  <  >  *  /  %  |  &  ^  <=  >=  ==  !=  ?  :

but no space after unary operators::

        &  *  +  -  ~  !  sizeof  typeof  alignof  __attribute__  defined

no space before the postfix increment & decrement unary operators::

        ++  --

no space after the prefix increment & decrement unary operators::

        ++  --

and no space around the ``.`` and ``->`` structure member operators.

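Put together, the spacing rules above might look like this in practice.  The
structure layout and ``fileinfo_score()`` are invented for the example and
carry no meaning beyond showing where the spaces go:

```c
#include <stddef.h>

/* Hypothetical structure, for illustration only. */
struct fileinfo {
	int size;
	int flags;
};

/*
 * Made-up helper that exercises the rules above: a space after "if" and
 * around binary and ternary operators, none after sizeof or unary
 * operators, none before the postfix ++, and none around "->".
 */
static int fileinfo_score(const struct fileinfo *info, int bonus)
{
	size_t len = sizeof(*info);
	int score = -bonus;

	if (info->size > 0 && !(info->flags & 1))
		score = info->size * 2 + bonus;
	score++;

	return len ? score : 0;
}
```
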
Do not leave trailing whitespace at the ends of lines.  Some editors with
``smart`` indentation will insert whitespace at the beginning of new lines as
appropriate, so you can start typing the next line of code right away.
However, some such editors do not remove the whitespace if you end up not
putting a line of code there, such as if you leave a blank line.  As a result,
you end up with lines containing trailing whitespace.

Git will warn you about patches that introduce trailing whitespace, and can
optionally strip the trailing whitespace for you; however, if applying a series
of patches, this may make later patches in the series fail by changing their
context lines.


4) Naming
---------

C is a Spartan language, and so should your naming be.  Unlike Modula-2
and Pascal programmers, C programmers do not use cute names like
ThisVariableIsATemporaryCounter.  A C programmer would call that
variable ``tmp``, which is much easier to write, and not in the least more
difficult to understand.

HOWEVER, while mixed-case names are frowned upon, descriptive names for
global variables are a must.  To call a global function ``foo`` is a
shooting offense.

GLOBAL variables (to be used only if you **really** need them) need to
have descriptive names, as do global functions.  If you have a function
that counts the number of active users, you should call that
``count_active_users()`` or similar; you should **not** call it ``cntusr()``.

Encoding the type of a function into the name (so-called Hungarian
notation) is brain damaged - the compiler knows the types anyway and can
check those, and it only confuses the programmer.  No wonder MicroSoft
makes buggy programs.

LOCAL variable names should be short, and to the point.  If you have
some random integer loop counter, it should probably be called ``i``.
Calling it ``loop_counter`` is non-productive, if there is no chance of it
being misunderstood.  Similarly, ``tmp`` can be just about any type of
variable that is used to hold a temporary value.

If you are afraid to mix up your local variable names, you have another
problem, which is called the function-growth-hormone-imbalance syndrome.
See chapter 6 (Functions).

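A small sketch of both halves of the rule: the global function gets a
descriptive name, while the locals stay short.  ``struct user`` and its
``active`` field are assumptions made for the example:

```c
#include <stddef.h>

/* Hypothetical user record, for illustration only. */
struct user {
	int active;
};

/*
 * A global function gets a descriptive name; the loop counter is just "i"
 * and the running total has a short local name.
 */
int count_active_users(const struct user *users, size_t nr_users)
{
	size_t i;
	int count = 0;

	for (i = 0; i < nr_users; i++) {
		if (users[i].active)
			count++;
	}

	return count;
}
```
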

5) Typedefs
-----------

Please don't use things like ``vps_t``.
It's a **mistake** to use typedef for structures and pointers. When you see a

.. code-block:: c

        vps_t a;

in the source, what does it mean?
In contrast, if it says

.. code-block:: c

        struct virtual_container *a;

you can actually tell what ``a`` is.

Lots of people think that typedefs ``help readability``. Not so. They are
useful only for:

 (a) totally opaque objects (where the typedef is actively used to **hide**
     what the object is).

     Example: ``pte_t`` etc. opaque objects that you can only access using
     the proper accessor functions.

     .. note::

       Opaqueness and ``accessor functions`` are not good in themselves.
       The reason we have them for things like pte_t etc. is that there
       really is absolutely **zero** portably accessible information there.

 (b) Clear integer types, where the abstraction **helps** avoid confusion
     whether it is ``int`` or ``long``.

     u8/u16/u32 are perfectly fine typedefs, although they fit into
     category (d) better than here.

     .. note::

       Again - there needs to be a **reason** for this. If something is
       ``unsigned long``, then there's no reason to do

        typedef unsigned long myflags_t;

     but if there is a clear reason for why it under certain circumstances
     might be an ``unsigned int`` and under other configurations might be
     ``unsigned long``, then by all means go ahead and use a typedef.

 (c) when you use sparse to literally create a **new** type for
     type-checking.

 (d) New types which are identical to standard C99 types, in certain
     exceptional circumstances.

     Although it would only take a short amount of time for the eyes and
     brain to become accustomed to the standard types like ``uint32_t``,
     some people object to their use anyway.

     Therefore, the Linux-specific ``u8/u16/u32/u64`` types and their
     signed equivalents which are identical to standard types are
     permitted -- although they are not mandatory in new code of your
     own.

     When editing existing code which already uses one or the other set
     of types, you should conform to the existing choices in that code.

 (e) Types safe for use in userspace.

     In certain structures which are visible to userspace, we cannot
     require C99 types and cannot use the ``u32`` form above. Thus, we
     use __u32 and similar types in all structures which are shared
     with userspace.

Maybe there are other cases too, but the rule should basically be to NEVER
EVER use a typedef unless you can clearly match one of those rules.

In general, a pointer, or a struct that has elements that can reasonably
be directly accessed should **never** be a typedef.

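To make the contrast concrete, here is a sketch of case (a) next to a plain
struct.  ``cookie_t`` and its accessors are hypothetical, and the members of
``struct virtual_container`` are invented for the example:

```c
/*
 * Hypothetical opaque type in the spirit of case (a): the typedef hides the
 * representation, and callers only go through the accessors below.
 */
typedef struct { unsigned long val; } cookie_t;

static cookie_t make_cookie(unsigned long val)
{
	return (cookie_t){ val };
}

static unsigned long cookie_val(cookie_t c)
{
	return c.val;
}

/*
 * By contrast, a struct whose members are meant to be accessed directly
 * stays a plain struct, so a declaration shows exactly what it is.
 */
struct virtual_container {
	int id;
	int nr_devices;
};
```
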

6) Functions
------------

Functions should be short and sweet, and do just one thing.  They should
fit on one or two screenfuls of text (the ISO/ANSI screen size is 80x24,
as we all know), and do one thing and do that well.

The maximum length of a function is inversely proportional to the
complexity and indentation level of that function.  So, if you have a
conceptually simple function that is just one long (but simple)
case-statement, where you have to do lots of small things for a lot of
different cases, it's OK to have a longer function.

However, if you have a complex function, and you suspect that a
less-than-gifted first-year high-school student might not even
understand what the function is all about, you should adhere to the
maximum limits all the more closely.  Use helper functions with
descriptive names (you can ask the compiler to in-line them if you think
it's performance-critical, and it will probably do a better job of it
than you would have done).

Another measure of the function is the number of local variables.  They
shouldn't exceed 5-10, or you're doing something wrong.  Re-think the
function, and split it into smaller pieces.  A human brain can
generally easily keep track of about 7 different things; anything more
and it gets confused.  You know you're brilliant, but maybe you'd like
to understand what you did 2 weeks from now.

In source files, separate functions with one blank line.  If the function is
exported, the **EXPORT** macro for it should follow immediately after the
closing function brace line.  E.g.:

.. code-block:: c

        int system_is_up(void)
        {
                return system_state == SYSTEM_RUNNING;
        }
        EXPORT_SYMBOL(system_is_up);

In function prototypes, include parameter names with their data types.
Although this is not required by the C language, it is preferred in Linux
because it is a simple way to add valuable information for the reader.

7) Centralized exiting of functions
-----------------------------------

Albeit deprecated by some people, the equivalent of the goto statement is
used frequently by compilers in the form of the unconditional jump instruction.

The goto statement comes in handy when a function exits from multiple
locations and some common work such as cleanup has to be done.  If there is no
cleanup needed then just return directly.

Choose label names which say what the goto does or why the goto exists.  An
example of a good name could be ``out_free_buffer:`` if the goto frees ``buffer``.
Avoid using GW-BASIC names like ``err1:`` and ``err2:``, as you would have to
renumber them if you ever add or remove exit paths, and they make correctness
difficult to verify anyway.

It is advised to indent labels with a single space (not tab), so that
``diff -p`` does not confuse labels with functions.

The rationale for using gotos is:

- unconditional statements are easier to understand and follow
- nesting is reduced
- errors by not updating individual exit points when making
  modifications are prevented
- saves the compiler work to optimize redundant code away ;)

.. code-block:: c

        int fun(int a)
        {
                int result = 0;
                char *buffer;

                buffer = kmalloc(SIZE, GFP_KERNEL);
                if (!buffer)
                        return -ENOMEM;

                if (condition1) {
                        while (loop1) {
                                ...
                        }
                        result = 1;
                        goto out_free_buffer;
                }
                ...
         out_free_buffer:
                kfree(buffer);
                return result;
        }

A common type of bug to be aware of is ``one err bugs`` which look like this:

.. code-block:: c

         err:
                kfree(foo->bar);
                kfree(foo);
                return ret;

The bug in this code is that on some exit paths ``foo`` is NULL.  Normally the
fix for this is to split it up into two error labels ``err_free_bar:`` and
``err_free_foo:``:

.. code-block:: c

         err_free_bar:
                kfree(foo->bar);
         err_free_foo:
                kfree(foo);
                return ret;

Ideally you should simulate errors to test all exit paths.

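One way to simulate errors is to give the function a knob that forces a
failure at a chosen step.  The userspace sketch below does that, with
``malloc()``/``free()`` standing in for ``kmalloc()``/``kfree()``; the
``fail_at`` parameter and ``foo_setup()`` are made up for the example, and
the labels are named ``out_`` because the success path runs through them too:

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical object with one nested allocation. */
struct foo {
	char *bar;
};

/*
 * Two-label cleanup in userspace.  fail_at simulates an error at a given
 * step so that every exit path can be exercised.
 */
int foo_setup(int fail_at)
{
	struct foo *foo;
	int ret = 0;

	foo = fail_at == 1 ? NULL : malloc(sizeof(*foo));
	if (!foo)
		return -ENOMEM;

	foo->bar = fail_at == 2 ? NULL : malloc(16);
	if (!foo->bar) {
		ret = -ENOMEM;
		goto out_free_foo;
	}

	if (fail_at == 3) {
		ret = -EINVAL;
		goto out_free_bar;
	}

	/* Success: the real work on foo and foo->bar would go here. */

 out_free_bar:
	free(foo->bar);
 out_free_foo:
	free(foo);
	return ret;
}
```

Calling ``foo_setup()`` with ``fail_at`` set to 0 through 3 walks each of the
four exit paths in turn.
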

8) Commenting
-------------

Comments are good, but there is also a danger of over-commenting.  NEVER
try to explain HOW your code works in a comment: it's much better to
write the code so that the **working** is obvious, and it's a waste of
time to explain badly written code.

Generally, you want your comments to tell WHAT your code does, not HOW.
Also, try to avoid putting comments inside a function body: if the
function is so complex that you need to separately comment parts of it,
you should probably go back to chapter 6 for a while.  You can make
small comments to note or warn about something particularly clever (or
ugly), but try to avoid excess.  Instead, put the comments at the head
of the function, telling people what it does, and possibly WHY it does
it.

When commenting the kernel API functions, please use the kernel-doc format.
See the files Documentation/kernel-documentation.rst and scripts/kernel-doc
for details.

The preferred style for long (multi-line) comments is:

.. code-block:: c

        /*
         * This is the preferred style for multi-line
         * comments in the Linux kernel source code.
         * Please use it consistently.
         *
         * Description:  A column of asterisks on the left side,
         * with beginning and ending almost-blank lines.
         */

For files in net/ and drivers/net/ the preferred style for long (multi-line)
comments is a little different.

.. code-block:: c

        /* The preferred comment style for files in net/ and drivers/net
         * looks like this.
         *
         * It is nearly the same as the generally preferred comment style,
         * but there is no initial almost-blank line.
         */

It's also important to comment data, whether they are basic types or derived
types.  To this end, use just one data declaration per line (no commas for
multiple data declarations).  This leaves you room for a small comment on each
item, explaining its use.

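For instance, a structure and a pair of variables declared this way might
look as follows.  All of the names are invented for the example:

```c
/* Hypothetical per-port statistics, for illustration only. */
struct port_stats {
	unsigned long rx_packets;	/* packets received */
	unsigned long tx_packets;	/* packets sent */
	unsigned int link_flaps;	/* times the link went down */
};

/* One declaration per line, instead of "int retries, timeout_ms;". */
int retries = 3;		/* attempts before giving up */
int timeout_ms = 250;		/* per-attempt timeout, in milliseconds */
```
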

9) You've made a mess of it
---------------------------

That's OK, we all do.  You've probably been told by your long-time Unix
user helper that ``GNU emacs`` automatically formats the C sources for
you, and you've noticed that yes, it does do that, but the defaults it
uses are less than desirable (in fact, they are worse than random
typing - an infinite number of monkeys typing into GNU emacs would never
make a good program).

So, you can either get rid of GNU emacs, or change it to use saner
values.  To do the latter, you can stick the following in your .emacs file:

.. code-block:: none

  (defun c-lineup-arglist-tabs-only (ignored)
    "Line up argument lists by tabs, not spaces"
    (let* ((anchor (c-langelem-pos c-syntactic-element))
           (column (c-langelem-2nd-pos c-syntactic-element))
           (offset (- (1+ column) anchor))
           (steps (floor offset c-basic-offset)))
      (* (max steps 1)
         c-basic-offset)))

  (add-hook 'c-mode-common-hook
            (lambda ()
              ;; Add kernel style
              (c-add-style
               "linux-tabs-only"
               '("linux" (c-offsets-alist
                          (arglist-cont-nonempty
                           c-lineup-gcc-asm-reg
                           c-lineup-arglist-tabs-only))))))

  (add-hook 'c-mode-hook
            (lambda ()
              (let ((filename (buffer-file-name)))
                ;; Enable kernel mode for the appropriate files
                (when (and filename
                           (string-match (expand-file-name "~/src/linux-trees")
                                         filename))
                  (setq indent-tabs-mode t)
                  (setq show-trailing-whitespace t)
                  (c-set-style "linux-tabs-only")))))

This will make emacs go better with the kernel coding style for C
files below ``~/src/linux-trees``.

But even if you fail in getting emacs to do sane formatting, not
everything is lost: use ``indent``.

Now, again, GNU indent has the same brain-dead settings that GNU emacs
has, which is why you need to give it a few command line options.
However, that's not too bad, because even the makers of GNU indent
recognize the authority of K&R (the GNU people aren't evil, they are
just severely misguided in this matter), so you just give indent the
options ``-kr -i8`` (stands for ``K&R, 8 character indents``), or use
``scripts/Lindent``, which indents in the latest style.

``indent`` has a lot of options, and especially when it comes to comment
re-formatting you may want to take a look at the man page.  But
remember: ``indent`` is not a fix for bad programming.



10) Kconfig configuration files
-------------------------------

For all of the Kconfig* configuration files throughout the source tree,
the indentation is somewhat different.  Lines under a ``config`` definition
are indented with one tab, while help text is indented an additional two
spaces.  Example::

  config AUDIT
        bool "Auditing support"
        depends on NET
        help
          Enable auditing infrastructure that can be used with another
          kernel subsystem, such as SELinux (which requires this for
          logging of avc messages output).  Does not do system-call
          auditing without CONFIG_AUDITSYSCALL.

Seriously dangerous features (such as write support for certain
filesystems) should advertise this prominently in their prompt string::

  config ADFS_FS_RW
        bool "ADFS write support (DANGEROUS)"
        depends on ADFS_FS
        ...

For full documentation on the configuration files, see the file
Documentation/kbuild/kconfig-language.txt.


11) Data structures
-------------------

Data structures that have visibility outside the single-threaded
environment they are created and destroyed in should always have
reference counts.  In the kernel, garbage collection doesn't exist (and
outside the kernel garbage collection is slow and inefficient), which
means that you absolutely **have** to reference count all your uses.

Reference counting means that you can avoid locking, and allows multiple
users to have access to the data structure in parallel - and not having
to worry about the structure suddenly going away from under them just
because they slept or did something else for a while.

Note that locking is **not** a replacement for reference counting.
Locking is used to keep data structures coherent, while reference
counting is a memory management technique.  Usually both are needed, and
they are not to be confused with each other.

Many data structures can indeed have two levels of reference counting,
when there are users of different ``classes``.  The subclass count counts
the number of subclass users, and decrements the global count just once
when the subclass count goes to zero.

Examples of this kind of ``multi-level-reference-counting`` can be found in
memory management (``struct mm_struct``: mm_users and mm_count), and in
filesystem code (``struct super_block``: s_count and s_active).

Remember: if another thread can find your data structure, and you don't
have a reference count on it, you almost certainly have a bug.

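The basic get/put pattern can be sketched in userspace with C11 atomics.
This is only an illustration of the idea, not how the kernel implements it:
``struct session`` and its three functions are made up, and kernel code would
use the kernel's own reference-counting helpers rather than ``<stdatomic.h>``:

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical refcounted object, for illustration only. */
struct session {
	atomic_int refcount;
	int id;
};

struct session *session_alloc(int id)
{
	struct session *s = malloc(sizeof(*s));

	if (!s)
		return NULL;

	/* The creator holds the first reference. */
	atomic_init(&s->refcount, 1);
	s->id = id;
	return s;
}

/* Take a reference before handing the object to another user. */
void session_get(struct session *s)
{
	atomic_fetch_add(&s->refcount, 1);
}

/* Drop a reference; returns 1 if this was the last one and s was freed. */
int session_put(struct session *s)
{
	if (atomic_fetch_sub(&s->refcount, 1) != 1)
		return 0;

	free(s);
	return 1;
}
```

Each user that can reach the object calls ``session_get()`` before using it
and ``session_put()`` when done, so the memory goes away only after the last
user has let go.
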

12) Macros, Enums and RTL
-------------------------

Names of macros defining constants and labels in enums are capitalized.

.. code-block:: c

        #define CONSTANT 0x12345

Enums are preferred when defining several related constants.

CAPITALIZED macro names are appreciated but macros resembling functions
may be named in lower case.

Generally, inline functions are preferable to macros resembling functions.

Macros with multiple statements should be enclosed in a do - while block:

.. code-block:: c

        #define macrofun(a, b, c)                       \
                do {                                    \
                        if (a == 5)                     \
                                do_this(b, c);          \
                } while (0)

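The point of the wrapper shows up when the macro is used as the body of an
unbraced ``if``: the do - while (0) form expands to a single statement that
takes a trailing semicolon, so a following ``else`` still pairs with the
outer ``if``.  A small sketch, with a stand-in ``do_this()`` and a made-up
``run()`` helper; the parameters are parenthesized here as an extra
precaution:

```c
static int calls;

/* Stand-in for the real work; it only records that it ran. */
static void do_this(int b, int c)
{
	calls += b + c;
}

/* Multi-statement macro wrapped in do - while (0). */
#define macrofun(a, b, c)			\
	do {					\
		if ((a) == 5)			\
			do_this((b), (c));	\
	} while (0)

/*
 * The macro is followed by a semicolon and used in an unbraced if/else,
 * just like a function call would be.
 */
static int run(int a, int enabled)
{
	calls = 0;
	if (enabled)
		macrofun(a, 1, 2);
	else
		calls = -1;

	return calls;
}
```
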
Things to avoid when using macros:

1) macros that affect control flow:

.. code-block:: c

        #define FOO(x)                                  \
                do {                                    \
                        if (blah(x) < 0)                \
                                return -EBUGGERED;      \
                } while (0)

is a **very** bad idea.  It looks like a function call but exits the ``calling``
function; don't break the internal parsers of those who will read the code.

2) macros that depend on having a local variable with a magic name:

.. code-block:: c

        #define FOO(val) bar(index, val)

might look like a good thing, but it's confusing as hell when one reads the
code and it's prone to breakage from seemingly innocent changes.

3) macros with arguments that are used as l-values: FOO(x) = y; will
bite you if somebody e.g. turns FOO into an inline function.

4) forgetting about precedence: macros defining constants using expressions
must enclose the expression in parentheses. Beware of similar issues with
macros using parameters.

.. code-block:: c

        #define CONSTANT 0x4000
        #define CONSTEXP (CONSTANT | 3)

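To see what goes wrong without the parentheses, compare the definitions
below.  ``CONSTEXP_BAD``, ``SQUARE_BAD``, ``SQUARE`` and the two small
functions are made up for the comparison:

```c
#define CONSTANT	0x4000
#define CONSTEXP_BAD	CONSTANT | 3	/* no parentheses: fragile */
#define CONSTEXP	(CONSTANT | 3)

/*
 * Parameters need the same care: SQUARE_BAD(1 + 2) expands to
 * (1 + 2 * 1 + 2), not to the square of 3.
 */
#define SQUARE_BAD(x)	(x * x)
#define SQUARE(x)	((x) * (x))

/* Expands to 2 * 0x4000 | 3, so only CONSTANT gets doubled. */
static int twice_bad(void)
{
	return 2 * CONSTEXP_BAD;
}

/* Expands to 2 * (0x4000 | 3), which is what was meant. */
static int twice_good(void)
{
	return 2 * CONSTEXP;
}
```
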
5) namespace collisions when defining local variables in macros resembling
functions:

.. code-block:: c

        #define FOO(x)                          \
        ({                                      \
                typeof(x) ret;                  \
                ret = calc_ret(x);              \
                (ret);                          \
        })

ret is a common name for a local variable - __foo_ret is less likely
to collide with an existing variable.

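With the rename applied, the macro can safely be handed a caller variable
that happens to be called ``ret``.  A sketch, assuming a compiler that
supports GNU C statement expressions and ``__typeof__``; ``calc_ret()`` and
``caller()`` are stand-ins invented for the example:

```c
/* Hypothetical calculation, for illustration only. */
static int calc_ret(int x)
{
	return x + 1;
}

/* The macro's local uses a prefixed name, so it cannot shadow "ret". */
#define FOO(x)					\
({						\
	__typeof__(x) __foo_ret;		\
	__foo_ret = calc_ret(x);		\
	(__foo_ret);				\
})

static int caller(void)
{
	/* Would be shadowed if the macro's local were also named "ret". */
	int ret = 41;

	return FOO(ret);
}
```
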
 765 The cpp manual deals with macros exhaustively. The gcc internals manual also
 766 covers RTL which is used frequently with assembly language in the kernel.
 767
 768
 769 13) Printing kernel messages
 770 ----------------------------
 771
 772 Kernel developers like to be seen as literate. Do mind the spelling
 773 of kernel messages to make a good impression. Do not use crippled
 774 words like ``dont``; use ``do not`` or ``don't`` instead.  Make the messages
 775 concise, clear, and unambiguous.
 776
 777 Kernel messages do not have to be terminated with a period.
 778
 779 Printing numbers in parentheses (%d) adds no value and should be avoided.
 780
 781 There are a number of driver model diagnostic macros in <linux/device.h>
 782 which you should use to make sure messages are matched to the right device
 783 and driver, and are tagged with the right level:  dev_err(), dev_warn(),
 784 dev_info(), and so forth.  For messages that aren't associated with a
 785 particular device, <linux/printk.h> defines pr_notice(), pr_info(),
 786 pr_warn(), pr_err(), etc.
 787
 788 Coming up with good debugging messages can be quite a challenge; and once
 789 you have them, they can be a huge help for remote troubleshooting.  However
 790 debug message printing is handled differently than printing other non-debug
 791 messages.  While the other pr_XXX() functions print unconditionally,
 792 pr_debug() does not; it is compiled out by default, unless either DEBUG is
 793 defined or CONFIG_DYNAMIC_DEBUG is set.  That is true for dev_dbg() also,
 794 and a related convention uses VERBOSE_DEBUG to add dev_vdbg() messages to
 795 the ones already enabled by DEBUG.
 796
 797 Many subsystems have Kconfig debug options to turn on -DDEBUG in the
 798 corresponding Makefile; in other cases specific files #define DEBUG.  And
 799 when a debug message should be unconditionally printed, such as if it is
 800 already inside a debug-related #ifdef section, printk(KERN_DEBUG ...) can be
 801 used.
 802
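The compile-out behaviour of pr_debug() described above can be imitated in
plain userspace C.  The following is only a sketch: every sketch_ name is
made up for illustration, fprintf() stands in for printk(), and the
``##__VA_ARGS__`` form assumes GCC or Clang.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical counter of messages that were really emitted. */
static int sketch_messages;

/* Hypothetical stand-in for printk(): prints a level tag and the message. */
static void sketch_printk(const char *level, const char *fmt, ...)
{
	va_list args;

	fprintf(stderr, "%s: ", level);
	va_start(args, fmt);
	vfprintf(stderr, fmt, args);
	va_end(args);
	sketch_messages++;
}

/* Printed unconditionally, like the non-debug pr_XXX() helpers. */
#define sketch_pr_info(fmt, ...) sketch_printk("info", fmt, ##__VA_ARGS__)

#ifdef DEBUG
#define sketch_pr_debug(fmt, ...) sketch_printk("debug", fmt, ##__VA_ARGS__)
#else
/* Arguments are still parsed by the compiler, but nothing is printed. */
#define sketch_pr_debug(fmt, ...) \
	do { if (0) sketch_printk("debug", fmt, ##__VA_ARGS__); } while (0)
#endif

/* Hypothetical caller: returns how many messages have been emitted so far. */
static int sketch_probe(int irq)
{
	sketch_pr_debug("probing irq %d\n", irq);
	sketch_pr_info("device ready\n");
	return sketch_messages;
}
```

Built without -DDEBUG, sketch_probe() emits only the info message; adding
-DDEBUG to the compiler flags brings the debug message back without touching
the source.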
 803
 804 14) Allocating memory
 805 ---------------------
 806
 807 The kernel provides the following general purpose memory allocators:
 808 kmalloc(), kzalloc(), kmalloc_array(), kcalloc(), vmalloc(), and
 809 vzalloc().  Please refer to the API documentation for further information
 810 about them.
 811
 812 The preferred form for passing the size of a struct is the following:
 813
 814 .. code-block:: c
 815
 816         p = kmalloc(sizeof(*p), ...);
 817
 818 The alternative form where the struct name is spelled out hurts readability and
 819 introduces an opportunity for a bug when the pointer variable type is changed
 820 but the corresponding sizeof that is passed to a memory allocator is not.
 821
 822 Casting the return value, which is a void pointer, is redundant. The conversion
 823 from void pointer to any other pointer type is guaranteed by the C programming
 824 language.
 825
 826 The preferred form for allocating an array is the following:
 827
 828 .. code-block:: c
 829
 830         p = kmalloc_array(n, sizeof(...), ...);
 831
 832 The preferred form for allocating a zeroed array is the following:
 833
 834 .. code-block:: c
 835
 836         p = kcalloc(n, sizeof(...), ...);
 837
 838 Both forms check for overflow on the allocation size n * sizeof(...),
 839 and return NULL if that occurred.
 840
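To make the overflow check concrete, here is a rough userspace sketch of
what an array allocator of this kind has to do.  It is not the kernel's
implementation: malloc() stands in for kmalloc(), and sketch_malloc_array(),
sketch_alloc_items() and struct sketch_item are hypothetical names.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical element type, used only for this illustration. */
struct sketch_item {
	int id;
	long weight;
};

/*
 * Hypothetical userspace analogue of an overflow-checked array allocator:
 * refuse the request if n * size would not fit in a size_t.
 */
static void *sketch_malloc_array(size_t n, size_t size)
{
	if (size != 0 && n > SIZE_MAX / size)
		return NULL;
	return malloc(n * size);
}

/* The element size is taken from the pointer, not from a spelled-out type. */
static struct sketch_item *sketch_alloc_items(size_t n)
{
	struct sketch_item *items;

	items = sketch_malloc_array(n, sizeof(*items));
	return items;
}
```

Note that no cast is needed on the returned void pointer, and that changing
the type of ``items`` later cannot leave a stale sizeof behind.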
 841
 842 15) The inline disease
 843 ----------------------
 844
 845 There appears to be a common misperception that gcc has a magic "make me
 846 faster" speedup option called ``inline``. While the use of inlines can be
 847 appropriate (for example as a means of replacing macros, see Chapter 12), it
 848 very often is not. Abundant use of the inline keyword leads to a much bigger
 849 kernel, which in turn slows the system as a whole down, due to a bigger
 850 icache footprint for the CPU and simply because there is less memory
 851 available for the pagecache. Just think about it; a pagecache miss causes a
 852 disk seek, which easily takes 5 milliseconds. There are a LOT of CPU cycles
 853 that can go into these 5 milliseconds.
 854
 855 A reasonable rule of thumb is not to put inline on functions that have more
 856 than 3 lines of code in them. An exception to this rule is the case where
 857 a parameter is known to be a compile-time constant, and as a result of this
 858 constantness you *know* the compiler will be able to optimize most of your
 859 function away at compile time. For a good example of this latter case, see
 860 the kmalloc() inline function.
 861
 862 Often people argue that adding inline to functions that are static and used
 863 only once is always a win since there is no space tradeoff. While this is
 864 technically correct, gcc is capable of inlining these automatically without
 865 help, and the maintenance issue of removing the inline when a second user
 866 appears outweighs the potential value of the hint that tells gcc to do
 867 something it would have done anyway.
 868
 869
 870 16) Function return values and names
 871 ------------------------------------
 872
 873 Functions can return values of many different kinds, and one of the
 874 most common is a value indicating whether the function succeeded or
 875 failed.  Such a value can be represented as an error-code integer
 876 (-Exxx = failure, 0 = success) or a ``succeeded`` boolean (0 = failure,
 877 non-zero = success).
 878
 879 Mixing up these two sorts of representations is a fertile source of
 880 difficult-to-find bugs.  If the C language included a strong distinction
 881 between integers and booleans then the compiler would find these mistakes
 882 for us... but it doesn't.  To help prevent such bugs, always follow this
 883 convention::
 884
 885         If the name of a function is an action or an imperative command,
 886         the function should return an error-code integer.  If the name
 887         is a predicate, the function should return a "succeeded" boolean.
 888
 889 For example, ``add work`` is a command, and the add_work() function returns 0
 890 for success or -EBUSY for failure.  In the same way, ``PCI device present`` is
 891 a predicate, and the pci_dev_present() function returns 1 if it succeeds in
 892 finding a matching device or 0 if it doesn't.
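
As a minimal userspace sketch of the convention (the queue type and every
sketch_ name are made up for illustration; only -EBUSY comes from the text
above), a command and a predicate might look like this:

```c
#include <errno.h>
#include <stdbool.h>

#define SKETCH_QUEUE_SLOTS 2

/* Hypothetical fixed-size work queue. */
struct sketch_queue {
	int items[SKETCH_QUEUE_SLOTS];
	int count;
};

/* Predicate: the name asks a question, so it returns a boolean. */
static bool sketch_queue_full(const struct sketch_queue *q)
{
	return q->count == SKETCH_QUEUE_SLOTS;
}

/* Command: the name is an action, so it returns 0 or a negative errno. */
static int sketch_add_work(struct sketch_queue *q, int item)
{
	if (sketch_queue_full(q))
		return -EBUSY;
	q->items[q->count++] = item;
	return 0;
}
```

A caller can then write ``if (sketch_queue_full(q))`` for the predicate and
``if (sketch_add_work(q, item))`` for the error path, and neither test reads
backwards.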
 893
 894 All EXPORTed functions must respect this convention, and so should all
 895 public functions.  Private (static) functions need not, but it is
 896 recommended that they do.
 897
 898 Functions whose return value is the actual result of a computation, rather
 899 than an indication of whether the computation succeeded, are not subject to
 900 this rule.  Generally they indicate failure by returning some out-of-range
 901 result.  Typical examples would be functions that return pointers; they use
 902 NULL or the ERR_PTR mechanism to report failure.
 903
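The idea behind the ERR_PTR mechanism can be sketched in userspace as
follows.  This is an approximation for illustration only, not the kernel's
code: the sketch_ helpers are hypothetical stand-ins for ERR_PTR(), PTR_ERR()
and IS_ERR(), and the limit of 4095 error codes is an assumption of the
sketch.

```c
#include <errno.h>
#include <stdint.h>

/* Assumed upper bound on errno values that can be encoded in a pointer. */
#define SKETCH_MAX_ERRNO 4095

/* Encode a negative errno in a pointer value. */
static inline void *sketch_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* Recover the negative errno from an error pointer. */
static inline long sketch_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* Error pointers occupy the last SKETCH_MAX_ERRNO addresses. */
static inline int sketch_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

static int sketch_table[4] = { 10, 20, 30, 40 };

/* Returns a pointer into the table, or an encoded -EINVAL on a bad index. */
static int *sketch_lookup(int index)
{
	if (index < 0 || index >= 4)
		return sketch_err_ptr(-EINVAL);
	return &sketch_table[index];
}
```

The caller checks the result with sketch_is_err() before dereferencing it,
and uses sketch_ptr_err() to find out what went wrong.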
 904
 905 17) Don't re-invent the kernel macros
 906 -------------------------------------
 907
 908 The header file include/linux/kernel.h contains a number of macros that
 909 you should use, rather than explicitly coding some variant of them yourself.
 910 For example, if you need to calculate the length of an array, take advantage
 911 of the macro
 912
 913 .. code-block:: c
 914
 915         #define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
 916
 917 Similarly, if you need to calculate the size of some structure member, use
 918
 919 .. code-block:: c
 920
 921         #define FIELD_SIZEOF(t, f) (sizeof(((t*)0)->f))
 922
 923 There are also min() and max() macros that do strict type checking if you
 924 need them.  Feel free to peruse that header file to see what else is already
 925 defined that you shouldn't reproduce in your code.
 926
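For illustration, here is how the two macros shown above might be used.  The
macro definitions are copied from the text so that the sketch builds outside
the kernel; struct sketch_packet, sketch_primes and sketch_sum_primes() are
hypothetical.

```c
#include <stddef.h>

/* Definitions as quoted in the text, repeated so the sketch stands alone. */
#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
#define FIELD_SIZEOF(t, f) (sizeof(((t*)0)->f))

/* Hypothetical structure, used to show FIELD_SIZEOF(). */
struct sketch_packet {
	unsigned char header[4];
	unsigned short length;
};

static const int sketch_primes[] = { 2, 3, 5, 7, 11 };

/* The loop bound follows the array, so adding an entry needs no other edit. */
static int sketch_sum_primes(void)
{
	size_t i;
	int sum = 0;

	for (i = 0; i < ARRAY_SIZE(sketch_primes); i++)
		sum += sketch_primes[i];
	return sum;
}
```

Here FIELD_SIZEOF(struct sketch_packet, header) evaluates to 4 without
needing an instance of the structure.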
 927
 928 18) Editor modelines and other cruft
 929 ------------------------------------
 930
 931 Some editors can interpret configuration information embedded in source files,
 932 indicated with special markers.  For example, emacs interprets lines marked
 933 like this:
 934
 935 .. code-block:: c
 936
 937         -*- mode: c -*-
 938
 939 Or like this:
 940
 941 .. code-block:: c
 942
 943         /*
 944         Local Variables:
 945         compile-command: "gcc -DMAGIC_DEBUG_FLAG foo.c"
 946         End:
 947         */
 948
 949 Vim interprets markers that look like this:
 950
 951 .. code-block:: c
 952
 953         /* vim:set sw=8 noet */
 954
 955 Do not include any of these in source files.  People have their own personal
 956 editor configurations, and your source files should not override them.  This
 957 includes markers for indentation and mode configuration.  People may use their
 958 own custom mode, or may have some other magic method for making indentation
 959 work correctly.
 960
 961
 962 19) Inline assembly
 963 -------------------
 964
 965 In architecture-specific code, you may need to use inline assembly to interface
 966 with CPU or platform functionality.  Don't hesitate to do so when necessary.
 967 However, don't use inline assembly gratuitously when C can do the job.  You can
 968 and should poke hardware from C when possible.
 969
 970 Consider writing simple helper functions that wrap common bits of inline
 971 assembly, rather than repeatedly writing them with slight variations.  Remember
 972 that inline assembly can use C parameters.
 973
 974 Large, non-trivial assembly functions should go in .S files, with corresponding
 975 C prototypes defined in C header files.  The C prototypes for assembly
 976 functions should use ``asmlinkage``.
 977
 978 You may need to mark your asm statement as volatile, to prevent GCC from
 979 removing it if GCC doesn't notice any side effects.  You don't always need to
 980 do so, though, and doing so unnecessarily can limit optimization.
 981
 982 When writing a single inline assembly statement containing multiple
 983 instructions, put each instruction on a separate line in a separate quoted
 984 string, and end each string except the last with ``\n\t`` to properly indent the
 985 next instruction in the assembly output:
 986
 987 .. code-block:: c
 988
 989         asm ("magic %reg1, #42\n\t"
 990              "more_magic %reg2, %reg3"
 991              : /* outputs */ : /* inputs */ : /* clobbers */);
 992
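The advice about wrapping inline assembly in small helpers can be sketched
with a compiler barrier, which is about the only asm statement that builds
on every architecture.  This assumes GCC or Clang; sketch_barrier(),
sketch_publish() and the two variables are hypothetical names, and the
barrier constrains only the compiler, not the CPU.

```c
static int sketch_data;
static int sketch_ready;

/*
 * Hypothetical helper: an empty asm statement with a "memory" clobber.
 * The compiler may not move memory accesses across it.  __asm__ and
 * __volatile__ are the spellings that also work in strict ISO C modes.
 */
static inline void sketch_barrier(void)
{
	__asm__ __volatile__("" : : : "memory");
}

/* Callers use the helper instead of repeating the asm statement. */
static void sketch_publish(int value)
{
	sketch_data = value;
	sketch_barrier();	/* keep the compiler from reordering the stores */
	sketch_ready = 1;
}
```

GCC already treats an asm statement without output operands as volatile, so
the qualifier is spelled out here only to make the intent obvious.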
 993
 994 20) Conditional Compilation
 995 ---------------------------
 996
 997 Wherever possible, don't use preprocessor conditionals (#if, #ifdef) in .c
 998 files; doing so makes code harder to read and logic harder to follow.  Instead,
 999 use such conditionals in a header file defining functions for use in those .c
1000 files, providing no-op stub versions in the #else case, and then call those
1001 functions unconditionally from .c files.  The compiler will avoid generating
1002 any code for the stub calls, producing identical results, but the logic will
1003 remain easy to follow.
1004
1005 Prefer to compile out entire functions, rather than portions of functions or
1006 portions of expressions.  Rather than putting an ifdef in an expression, factor
1007 out part or all of the expression into a separate helper function and apply the
1008 conditional to that function.
1009
1010 If you have a function or variable which may potentially go unused in a
1011 particular configuration, and the compiler would warn about its definition
1012 going unused, mark the definition as __maybe_unused rather than wrapping it in
1013 a preprocessor conditional.  (However, if a function or variable *always* goes
1014 unused, delete it.)
1015
1016 Within code, where possible, use the IS_ENABLED macro to convert a Kconfig
1017 symbol into a C boolean expression, and use it in a normal C conditional:
1018
1019 .. code-block:: c
1020
1021         if (IS_ENABLED(CONFIG_SOMETHING)) {
1022                 ...
1023         }
1024
1025 The compiler will constant-fold the conditional away, and include or exclude
1026 the block of code just as with an #ifdef, so this will not add any runtime
1027 overhead.  However, this approach still allows the C compiler to see the code
1028 inside the block, and check it for correctness (syntax, types, symbol
1029 references, etc).  Thus, you still have to use an #ifdef if the code inside the
1030 block references symbols that will not exist if the condition is not met.
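
A userspace approximation of this pattern is sketched below.  IS_ENABLED
itself is only available inside the kernel tree, so a plain 0/1 macro stands
in for it; SKETCH_CONFIG_FAST_PATH and both functions are hypothetical.

```c
/* Hypothetical option, disabled in this build. */
#define SKETCH_CONFIG_FAST_PATH 0

/* Hypothetical optimized variant: sums two bytes per iteration. */
static int sketch_fast_sum(const unsigned char *buf, int len)
{
	int i, sum = 0;

	for (i = 0; i + 1 < len; i += 2)
		sum += buf[i] + buf[i + 1];
	if (i < len)
		sum += buf[i];
	return sum;
}

static int sketch_sum(const unsigned char *buf, int len)
{
	int i, sum = 0;

	if (SKETCH_CONFIG_FAST_PATH) {
		/* Still compiled and type-checked, although never executed. */
		return sketch_fast_sum(buf, len);
	}

	for (i = 0; i < len; i++)
		sum += buf[i];
	return sum;
}
```

A typo inside the disabled branch breaks the build in every configuration,
which is exactly the checking an #ifdef would have hidden.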
1031
1032 At the end of any non-trivial #if or #ifdef block (more than a few lines),
1033 place a comment after the #endif on the same line, noting the conditional
1034 expression used.  For instance:
1035
1036 .. code-block:: c
1037
1038         #ifdef CONFIG_SOMETHING
1039         ...
1040         #endif /* CONFIG_SOMETHING */
1041
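The header-stub approach from the start of this section, together with the
#endif comment, could look roughly like this.  Everything here is a
hypothetical illustration: CONFIG_SKETCH_STATS is not a real Kconfig symbol,
and the first part would normally live in a header file.

```c
/* --- would normally live in a header file --- */
#ifdef CONFIG_SKETCH_STATS
static int sketch_stats_hits;

static inline void sketch_stats_record(void)
{
	sketch_stats_hits++;
}

static inline int sketch_stats_read(void)
{
	return sketch_stats_hits;
}
#else
/* No-op stubs: callers compile unchanged when the option is off. */
static inline void sketch_stats_record(void)
{
}

static inline int sketch_stats_read(void)
{
	return 0;
}
#endif /* CONFIG_SKETCH_STATS */

/* --- the .c file calls the helper unconditionally --- */
static int sketch_handle_request(int value)
{
	sketch_stats_record();
	return value * 2;
}
```

With the option off, the call to sketch_stats_record() is an empty inline
function that the compiler can drop entirely, and sketch_handle_request()
contains no preprocessor conditionals at all.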
1042
1043 Appendix I) References
1044 ----------------------
1045
1046 The C Programming Language, Second Edition
1047 by Brian W. Kernighan and Dennis M. Ritchie.
1048 Prentice Hall, Inc., 1988.
1049 ISBN 0-13-110362-8 (paperback), 0-13-110370-9 (hardback).
1050
1051 The Practice of Programming
1052 by Brian W. Kernighan and Rob Pike.
1053 Addison-Wesley, Inc., 1999.
1054 ISBN 0-201-61586-X.
1055
1056 GNU manuals - where in compliance with K&R and this text - for cpp, gcc,
1057 gcc internals and indent, all available from http://www.gnu.org/manual/
1058
1059 WG14 is the international standardization working group for the programming
1060 language C, URL: http://www.open-std.org/JTC1/SC22/WG14/
1061
1062 Kernel CodingStyle, by greg@kroah.com at OLS 2002:
1063 http://www.kroah.com/linux/talks/ols_2002_kernel_codingstyle_talk/html/