TransWikia.com

Mathematica style guide?

Mathematica Asked by faysou on December 3, 2020

I came across this site today, https://code.google.com/p/google-styleguide/, which covers programming style for several languages. What would be the best programming practices in Mathematica, for both small and large projects?

7 Answers

I think this is a very relevant question. It is widely agreed that having a coding style guide is a very good (inevitable?) thing for any project where several people write code. It also seems to be agreed that having a style guide or standard at all matters more than exactly what it looks like. I am also convinced that, especially for Mathematica, many details should be handled differently for different kinds of projects and teams.

Thus, instead of giving just another example of a style convention, I think it makes more sense to write up a list of things that such a guideline could or should address. Filling these entries with content (or dropping some of them) would then be a second step, and probably every team or project will want its own details. I would prefer not to fill in specific suggestions for each entry here (too much danger of disagreement). If people think it would make sense to work on a "Mathematica Stack Exchange users" suggestion, the other wiki answer from Szabolcs could be used for that. Of course such a list will never be complete, and for some entries it might be open to debate whether they are relevant at all. I made this list a community wiki and invite everyone to contribute. My suggestion is not to delete entries that one thinks are irrelevant, but only to add pro/con arguments for them.

Use of Tools

It might make sense to set requirements about which tools to use or not to use; there are plenty of ways to write, develop, document and test Mathematica code. It is certainly good to have a convention about that. Possible decisions include:

  • use of frontend, workbench, text-editors, other IDEs (e.g. the Mathematica IDEAS plugin) for code development
  • use of internal or external tools to write/run tests
  • use of version control system and which
  • use of external tools for e.g. documentation

Of course not all of these are independent. Notebooks are known not to work well with version control systems, so using the latter might influence the decision about whether to use the front end (or, more precisely, notebook files for code) at all.

File/Code Organisation

Use of File Formats

  • use of notebooks or packages for source code
  • use of notebooks or other formats for documentation
  • file formats for data that is relevant for the project (e.g. csv vs. excel)

Organization of Project/Source-Code Directory

  • define directory layout and which content should go where
  • modularisation of code:
    • how much content per file: one function/symbol definition per file,
    • how many lines are typically acceptable per function, per file,...
    • under which conditions are exceptions from the above acceptable?
    • use of extra directories vs. just extra package files for subpackages
    • use and naming of public/private contexts for subpackages
  • use of Protect and other Attributes for symbols.

Naming Conventions

Directory/File Names

  • require restrictions so that package files can be loaded with Needs
  • uppercase/camelcase/... conventions for directories and filenames
  • use of "-","_", " ",... in (non-package) filenames
  • use of file extensions, upper-/lower-case

Symbol Names

  • upper vs. lower CamelCase, allow/suggest just lower case
  • allow non-ASCII characters in symbol names or not? If yes, restrict to a subset such as Greek letters?
  • make naming depend on symbol purpose and content? If yes:
    • use verbs for symbols used as functions, nouns for symbols used as variables
    • use of singular vs. plural for lists (number[[idx]] vs. numbers[[idx]]), or other conventions such as numberArray[[x]]
    • conventions for e.g. variables used as loop counters, flags, ...
    • use of Mathematica-like xxxQ functions vs. isXxx as used in many other languages
    • use a leading $ to indicate use of a global variable.
    • all-uppercase names for constants (widely used in other languages, but does anyone use that in Mathematica?)
    • allow single letter symbol names or not
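
To make a few of these choices concrete, here is a minimal sketch of one possible outcome. All names (evenPositiveQ, $defaultTolerance, nearlyEqualQ) are made up for illustration and are not a recommendation from this list.

```mathematica
(* Hypothetical names, only to illustrate some of the conventions listed above. *)

(* Mathematica-style predicate: the name ends in Q and the result is always True or False. *)
evenPositiveQ[n_Integer] := EvenQ[n] && Positive[n]
evenPositiveQ[_] := False

(* A leading $ marks a global setting that other functions read. *)
$defaultTolerance = 10^-6;

(* Another predicate that depends on the global setting. *)
nearlyEqualQ[a_, b_] := Abs[a - b] <= $defaultTolerance

(* Plural noun for a list, so that numbers[[idx]] reads naturally. *)
numbers = {4, -4, 7};
evenPositiveQ /@ numbers
```

Whether the predicate is called evenPositiveQ or isEvenPositive matters less than picking one form and using it throughout the project.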

Option Names

All of the conventions made for symbol names need to be made here too, not necessarily with the same outcome. Additionally:

  • use of strings vs. symbols for option names
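
As a rough illustration of the strings vs. symbols decision, the sketch below defines one option of each kind on a hypothetical function describe. The function and its option names (shout, "Prefix") are assumptions made up for this example.

```mathematica
(* Hypothetical function with one symbol-named and one string-named option. *)
Options[describe] = {shout -> False, "Prefix" -> "value: "};

describe[x_, OptionsPattern[]] :=
  With[{text = OptionValue["Prefix"] <> ToString[x]},
    If[TrueQ[OptionValue[shout]], ToUpperCase[text], text]
  ]

describe[42]
describe[42, "Prefix" -> "x = ", shout -> True]
```

A string name such as "Prefix" does not create a symbol in any context, so it cannot be shadowed by another package; a symbol name such as shout has to be exported and documented like any other public symbol.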

Documentation

  • prefer inline documentation with (**) or extra text cells/lines before/after relevant (function) definitions
  • require usage messages, probably at least stubs for auto completion
  • have more detailed explanation in extra files (e.g. mathematical background, preliminary experiments etc.)
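
A minimal sketch of what the first two points could look like for a single definition. The function clampValue is hypothetical.

```mathematica
(* Hypothetical function; the usage message also serves as a stub for auto-completion. *)
clampValue::usage =
  "clampValue[x, {min, max}] returns x limited to the interval from min to max.";

(* Inline comment next to the definition: Clip does the work,
   the wrapper only fixes the shape of the arguments. *)
clampValue[x_, {min_, max_}] := Clip[x, {min, max}]

clampValue[12, {0, 10}]
```

With the usage message in place, ?clampValue shows the description in an interactive session.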

Code Layout

Use of Shortcuts, Parentheses and Such

Mathematica code could theoretically be written entirely in FullForm, and a team with a strong Lisp background might actually prefer that. But the language is full of shortcuts, and many of them help make code more readable. With exaggerated use of shortcuts, however, Mathematica code can look like entries in a Perl one-liner contest, which would make good comic-strip curse strings. It certainly makes sense to give some guidelines about the use of such shortcuts:

  • avoid or prefer shortcuts in general?
  • white- and blacklists for shortcuts
  • define conditions under which shortcuts are to be used (e.g. I often use /@ when the resulting expression fits on one line and no additional parentheses are required, but otherwise I prefer an explicit Map with my standard convention for indenting and line breaks).
  • it often makes sense to write parentheses even when they are not strictly necessary, so it might be relevant to define when parentheses are allowed/required/forbidden, or are to be replaced by code which doesn't need them (e.g. ()& vs. Function[]).
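
To show what is at stake, here is a small sketch with the same mapping written both ways, followed by a case where the parentheses around a pure function change the meaning. The symbol key is just an undefined placeholder.

```mathematica
(* The same mapping written with shortcuts and spelled out. *)
squaresShort = #^2 & /@ {1, 2, 3};
squaresLong = Map[Function[x, x^2], {1, 2, 3}];
squaresShort === squaresLong

(* A case where parentheses are required: -> binds more tightly than &. *)
Head[key -> (#^2 &)]   (* a Rule whose right-hand side is a pure function *)
Head[key -> #^2 &]     (* a pure function whose body is a Rule *)
```

A guideline such as "always parenthesize pure functions inside rules and options" removes this class of mistake at the cost of a few extra characters.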

Line Breaks and Indenting

  • where to put line breaks
    • for function definitions put linebreak after := or not
    • extra linebreak before closing ] and } or not
  • where to put spaces, where not
    • after , in list of arguments
    • around operators like +, -, =
  • use standard form cells with automatic indentation or input form cells / pure text with manual indentation
  • how much indentation
  • use tabs or spaces for indentation

Constructs Preference/Shunning

Mathematica is a very "rich" language, and there are literally hundreds of ways to achieve the same thing. It might make sense to require certain standard solutions, or to prefer certain constructs, so that team members can understand each other's code more easily, e.g.:

  • looping constructs: e.g. favour Do vs. For, favour non-indexing constructs like Map and Scan vs. their indexing counterparts Table / Do
  • preferences of "paradigms", e.g. pattern matching vs. functional vs. procedural styles, e.g.: Replace[result, $Failed :> (Message[...]; Throw[...])] vs. showMessageIfFailed[result]; vs. If[result === $Failed, Message[...]]
  • use of pure functions (many of them nested are hard to read/understand)
  • f=Function[x,x^2] vs. f=#^2& vs. f[x]:=x^2
  • restrict use of symbols to those available to certain Mathematica versions.
  • object/data representation: Association, Dataset, list of rules (and again: symbol or string keys?), matrix/list with positional meaning, custom head denoting an object, ManagedLibraryExpression
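
For comparison, the looping and function-definition alternatives from the list can be put side by side. This is only a sketch with illustrative names; it does not argue for one form.

```mathematica
(* Three ways to collect the squares of 1 to 5. *)

(* For: the index is a global symbol that is initialized, tested and incremented by hand. *)
viaFor = {};
For[i = 1, i <= 5, i++, AppendTo[viaFor, i^2]];

(* Do: the iterator is localized, but the result is still built by side effect. *)
viaDo = {};
Do[AppendTo[viaDo, k^2], {k, 5}];

(* Map: no index and no side effect. *)
viaMap = Map[#^2 &, Range[5]];

viaFor === viaDo === viaMap

(* Three ways to define the same squaring function. *)
f1 = Function[x, x^2];
f2 = #^2 &;
f3[x_] := x^2
{f1[3], f2[3], f3[3]}
```

Note that the For version leaves i set to 6 in the global context afterwards, which is one of the usual arguments for the other two forms.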

Correct answer by Albert Retey on December 3, 2020

Let me try with a few simple ("obvious"?) style guidelines I try to follow:

  • Use meaningful names that are spelled out or that use widely-adopted abbreviations from the field of application.
  • Begin names with lower-case letters (except when they're going into a Package for others' use) and then use camelCasing.
  • Avoid nesting functions too deeply with use of brackets; instead, try to use the built-in special input forms, e.g., /@ for Map, @@ for Apply, prefix form with @, and (when it's semantically appropriate) postfix form with //.
  • Intermix text cells containing documentation with input cells containing code.
  • Define ::usage strings for functions to be used by others, or for functions whose syntax you may readily forget.

Answered by murray on December 3, 2020

Everyone will have their own preferences about coding style. This is especially true for Mathematica, as most work done in this language is interactive, and until recently there was relatively little open collaboration between people that could have led to the development of standards. The existence of this site (Mathematica.SE) has helped make considerable progress in this area.

Let's try to collect a few guidelines which are already commonly followed in the online Mathematica community.

Naming things

  • Symbol names may contain only letters (including letter-like forms such as Greek letters), digits and $; underscores and hyphens are not available. This naturally leads to using camel case for names.

  • When developing packages meant to be used by others, use fully spelt out, descriptive names for public symbols.

  • When doing interactive work, use only names starting with lowercase, e.g. findAllRoots.

  • When writing packages, use only capitalized names for public symbols, e.g. FindAllRoots. However, for symbols private to the package use lowercase.

  • Start the names of constants or flags with a $ sign. This is typically used for global variables that control in some way how the system works, e.g. $MaxExtraPrecision.

When there's more than one way to write something

  • Instead of f[g[h[x]]], write f@g@h[x] for readability. Instead of f[{a,b,c}], write f@{a,b,c}.

  • If you have a purely stylistic choice between = and :=, use = for variable definitions and := for function definitions.

  • When evaluation has side effects, prefer DownValues to OwnValues. I.e., use randomNumber[] := RandomReal[] instead of randomNumber := RandomReal[].

  • For procedural loops, prefer Do over For.
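
The point about side effects is easier to see with a deterministic example than with random numbers. The following sketch uses a hypothetical counter; the names nextId and nextIdHidden are made up.

```mathematica
(* = evaluates its right-hand side once, when the definition is made. *)
counter = 0;
startValue = counter;

(* := re-evaluates on every call; the empty brackets signal that work is being done. *)
nextId[] := ++counter

ids = {nextId[], nextId[], nextId[]}

(* The OwnValues form behaves the same way, but reads like a plain variable. *)
nextIdHidden := ++counter
moreIds = {nextIdHidden, nextIdHidden}
```

Here ids is {1, 2, 3} and moreIds is {4, 5}. Nothing at the call site of nextIdHidden hints that its value changes on every use, which is the reason for preferring the DownValues form.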

Answered by Szabolcs on December 3, 2020

In the excellent responses above, I find that one of my favourite guidelines is missing. It concerns the case when one has to apply one function after another.

Instead of writing

f[g[h[k[q[x]]]]]

write either

x // (q /* k /* h /* g /* f)

or

RightComposition[q, k, h, g, f][x]

I prefer this method because it is reminiscent of the pipeline operator in F#. And triple-clicking has never been a good friend of mine.
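
As a concrete sketch of the same idea, assuming version 10 or later (for /* and the operator forms of Select and Map), with made-up names for the stages:

```mathematica
(* Hypothetical pipeline: keep the even numbers, square them, add them up. *)
keepEven = Select[EvenQ];
squareAll = Map[#^2 &];
pipeline = keepEven /* squareAll /* Total;

(* Nested form, read from the inside out. *)
nested = Total[Map[#^2 &, Select[Range[10], EvenQ]]];

(* Pipeline form, read from left to right. *)
piped = Range[10] // pipeline;

nested === piped
```

Both forms give 220; the pipeline form lists the steps in the order in which they are applied.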

Answered by Shredderroy on December 3, 2020

I found Roman Maeder's package template useful as a general setup guide. (It is more comprehensive than any setup I actually use.) From Programming in Mathematica, 3rd ed. (1996), page 290.

[Image: the package template from the book]
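
The template itself is only available as an image. As a rough stand-in, the following is a generic package skeleton using the standard BeginPackage/Begin mechanism. It is not Maeder's template, and the names MyPackage`, AddOne and increment are placeholders.

```mathematica
(* Generic package skeleton; all names are placeholders. *)
BeginPackage["MyPackage`"];

(* Public symbols are introduced by their usage messages. *)
AddOne::usage = "AddOne[x] returns x + 1.";

Begin["`Private`"];

(* Private helper: lives in MyPackage`Private` and is not visible to users. *)
increment = 1;

AddOne[x_] := x + increment

End[];  (* leave the private context *)

Protect[AddOne];

EndPackage[];
```

After loading, AddOne is on the context path while increment is not, so a user-level symbol named increment cannot interfere with the package.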

Answered by Chris Degnen on December 3, 2020

Not a general answer, but something I haven't seen in any other language. What I have found, using a 1080p monitor with the Mathematica Benchmark, is that formatting every function by indenting every argument and placing the commas on the first level, e.g.

(* === (SameQ) always yields True or False; == can stay unevaluated for symbols *)
If[ Head@list[[1]] === Symbol,
    foo[]
,
    If[ Head@list[[1,2]] === Symbol,
        bar[]
    ,
        foobar[]
    ]
]

improves readability enormously. The codebase I am currently working on is far too clever for its own good, i.e. it has single lines that are way too long. Especially since there is no clearly visible "else", I find it very important to at least give the second comma of If its own line.

Answered by Kevin Groenke on December 3, 2020

Separate Styles and Options from Functional Code

Almost nothing has been said about styles and options, so I will do that now.

Basis

In all modern languages it is best practice to separate the presentation and behavior of interface elements from content. In web development, presentation is controlled with cascading style sheets (CSS) and behavior with JavaScript (JS). Early on, both were written inline or embedded throughout web pages (HTML), which made the pages cluttered and clumsy to work with. This is analogous to how Mathematica code currently tends to embed styles and options throughout functional code. As web pages became more dynamic and grew much larger, it became essential to separate presentation and behavior from content by placing CSS and JS in separate files and linking them to the HTML files with a few simple commands at the top. We can't do that in Mathematica (we can, but we shouldn't); however, we can do the next best thing: place all styles and options in cells apart from the rest of the code, and define hooks and handlers that link them. This accomplishes three main things:

  • It makes the dependent code more readable and styles and options easier to find when debugging or adjustments are needed.
  • It helps in keeping style directives and options consistent.
  • It practices DRY (don't repeat yourself), resulting in less code and lighter files.

This is probably not a great strategy for smaller projects of about a full page or less, as hooks and handlers may cause some bloat. However, it is still good practice for becoming proficient at this, and for setting up smaller projects that are expected to grow or be refactored over time.

Implementing a Strategy

1. Don't Be Clever

Before I suggest a solution, I wish to stress that just because you can do something doesn't mean you should. Do not change the default options of native functions (for example with SetOptions), especially when other developers with various backgrounds or levels of experience may need to work with your code. Too much can go wrong just by forgetting that you did it. There is a better way, as discussed shortly. Options can also be borrowed between functions, but such code usually ends up entangled (violating KISS), which makes it very difficult to read or debug. So I don't recommend this either, even though it practices DRY.

2. Separate Presentation and Behavior from Content

The following uses Grid as an example.

My solution is to specify all options in variables, which can then be inserted into a function call in place of the actual options.

gridops = Sequence[op1->spec1, op2->spec2, ...]
Grid[{row1, row2,...},gridops]

Sometimes you will have many output structures that share common option specs but differ in minor ways. You can do one of two things. First, you can trail the unique option specs (op3 and op4) after the options variable. This should be the only exception to the rule of separating presentation from content.

gridops = Sequence[op1->spec1, op2->spec2, ...]
Grid[{row1, row2,...},gridops,op3->spec3,op4->spec4]

Or you can create a function that takes the unique option specs as parameters. One caveat: you may need to wrap the call in Evaluate when the dependent function holds its arguments, as Plot does (it has the HoldAll attribute). Grid does not hold its arguments, so the Evaluate below is harmless but not strictly required.

gridOps[spec1_,spec2_]:=Sequence[op1->spec1, op2->spec2,...common ops here]
Grid[{row1, row2,...},Evaluate[gridOps[spec1,spec2]]]

The same can be done wherever Style is used. For my grid headers I might wrap the text in the top row of cells with Style and place the common style specs in a variable. If you have unique styles, you can create a function to handle them or trail them, as was done with the options.

headerstyle = Sequence[12, FontFamily -> "Helvetica", Bold]
Style["string",headerstyle,...trailing unique styles]
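
Putting the pieces of this point together, a filled-in version might look like the sketch below. The concrete option values and the names gridOps, gridOpsWith and headerStyle are assumptions chosen for illustration.

```mathematica
(* Hypothetical option and style handlers, kept apart from the code that uses them. *)
gridOps = Sequence[Frame -> All, Alignment -> Center];
headerStyle = Sequence[12, Bold, FontFamily -> "Helvetica"];

(* A handler with parameters for the specs that vary between grids. *)
gridOpsWith[bg_, spacing_] :=
  Sequence[Background -> bg, Spacings -> spacing, gridOps]

(* Dependent code: no literal option or style spec appears here. *)
table = Grid[
   {
    Map[Style[#, headerStyle] &, {"name", "value"}],
    {"a", 1},
    {"b", 2}
   },
   gridOpsWith[LightBlue, {2, 1}]
];
```

Because Sequence splices itself into the argument list, Grid and Style receive the same arguments as if the specs had been typed inline.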

3. Collect All Option and Style Handlers in One Place

Group all style directive and option spec handlers into a single cell at the top of the file, placed so that they load before the code that depends on them. This is analogous to how CSS and JavaScript are loaded in web pages. If there is an overabundance of handlers, use several cells, keeping related style and option definitions together. It is also a good idea to set those cells as Initialization Cells.

4. Go Further With Theming

Most projects need only a few options or style directives, but these are sprinkled consistently in a lot of places. Instead of repeating the same specs, define them in alias variables (a newer feature in CSS, custom properties, does this) and insert the aliases everywhere the specs occur, including the style and option handlers from points 2 and 3. The aliases are then collected in a cell above any dependent code. This lets you make quick and consistent changes to behavior or theme from just one place in your file. Your hierarchy might look like the one below. As before, it might be a good idea to set these cells as Initialization Cells.

(* TOP CELL *)
(* theme aliases go here *)
bgcolor=Darker[Blue];
fontcolor=White;
fontfam="Helvetica";
fontsize=12;
font=Italic;
spaces={4,4};
align=Center;
frame=All;
output=Traditional;
method=Automatic;

(* NEXT CELL *)
(* handlers from points 2 and 3 *)
(* may be injected with aliases from point 4 *)

(* ALL OTHER CELLS *)
(* all other code *)
(* injected with hooks and aliases *)
(* from points 2 - 4 *)
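
A small filled-in version of this hierarchy, as a sketch only. The alias names (themeBackground and so on) and the values are made up for the example.

```mathematica
(* TOP CELL: theme aliases (hypothetical values) *)
themeBackground = Darker[Blue];
themeFontColor = White;
themeFontSize = 12;

(* NEXT CELL: handlers built from the aliases *)
headerStyle = Sequence[themeFontSize, Bold];
gridOps = Sequence[
   Background -> themeBackground,
   BaseStyle -> {FontColor -> themeFontColor},
   Frame -> All
];

(* ALL OTHER CELLS: code that refers only to the handlers *)
report = Grid[
   {Map[Style[#, headerStyle] &, {"key", "value"}], {"a", 1}},
   gridOps
];
```

Since the handlers are defined with =, they capture the alias values at definition time. After changing an alias, the handler cell has to be re-evaluated as well (or the handlers can be defined with := instead).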

Your Opinion Matters

What are your thoughts on this? Do you already do something similar? What are your best practices?

Answered by Jules Manson on December 3, 2020
