Contributing

Contributions to ObsPlus are welcomed and appreciated. Before proceeding, please be aware of our code of conduct.

Getting set up

The following steps are needed to set up ObsPlus for development:

1. Clone ObsPlus

git clone https://github.com/niosh-mining/obsplus

2. Pull tags

Make sure to pull all of the latest git tags so the dynamic versioning provided by versioneer works.

NOTE: This step is important if some time has passed since ObsPlus was cloned.

git pull origin master --tags

3. Create a virtual environment (optional)

Create and activate a virtual environment so ObsPlus will not interfere with the base (or system) Python installation.

If you are using Anaconda:

conda create -n obsplus_dev obspy pandas pre-commit
conda activate obsplus_dev
cd obsplus

4. Install ObsPlus in development mode

pip install -e ".[dev]"

5. Set up pre-commit hooks

ObsPlus uses several pre-commit hooks to ensure the code stays tidy.

pre-commit install -f

Branching and versioning

We create new features or bug fixes in their own branches and merge them into master via pull requests. We may switch to a more complex branching model if the need arises.

If substantial new features have been added since the last release, we will bump the minor version. If only bug fixes or minor changes have been made, we will bump only the patch version. Like most Python projects, we loosely follow semantic versioning, in the sense that we will not bump the major version until ObsPlus is stable.
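
As a rough illustration of the rule above, the sketch below shows how the next version number would be chosen. The helper name bump_version is hypothetical and not part of ObsPlus, and resetting the patch number to zero on a minor bump is an assumption borrowed from common semantic-versioning practice. In ObsPlus itself the version is derived from git tags by versioneer, so nothing like this needs to be run by hand.

```python
def bump_version(version: str, substantial_features: bool) -> str:
    """Return the next version string (hypothetical helper, for illustration)."""
    major, minor, patch = (int(part) for part in version.split("."))
    if substantial_features:
        # Substantial new features: bump the minor version.
        # Resetting the patch number to 0 is an assumption, not an ObsPlus rule.
        return f"{major}.{minor + 1}.0"
    # Only bug fixes or minor changes: bump the patch version.
    return f"{major}.{minor}.{patch + 1}"


print(bump_version("0.1.3", substantial_features=True))   # 0.2.0
print(bump_version("0.1.3", substantial_features=False))  # 0.1.4
```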

Running the tests

The test suite is run with pytest. Make sure your current directory is the cloned ObsPlus repo and that you have followed the development setup instructions listed above. Invoke pytest from the command line:

pytest tests
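
For contributors new to pytest, a test is simply a function whose name starts with test_ and which uses plain assert statements; pytest collects such functions from files named test_*.py by default. The following is a minimal sketch only: add_constant and the test name are hypothetical stand-ins, not ObsPlus code.

```python
import pandas as pd


def add_constant(df: pd.DataFrame, to_add: float) -> pd.DataFrame:
    """Hypothetical function under test: return a new frame with to_add added."""
    return df + to_add


def test_add_constant_returns_new_frame():
    """The output holds the shifted values and the input is left unchanged."""
    df = pd.DataFrame({"a": [1, 2]})
    out = add_constant(df, 1)
    assert out["a"].tolist() == [2, 3]
    assert df["a"].tolist() == [1, 2]
```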

Building the documentation

The documentation can be built using the script “make_docs.py” in the scripts directory. If you have followed the instructions above, all the required dependencies should be installed.

python scripts/make_docs.py

The docs can then be viewed by opening the newly created HTML index at docs/_build/html/index.html in a browser.

General guidelines

ObsPlus uses Black for code formatting and flake8 for linting. If you have properly installed ObsPlus’ pre-commit hooks, they will be invoked automatically when you make a git commit. If any complaints are raised, simply address them and try again.

Use numpy style docstrings. All public code (names that do not start with a _) should have a “full” docstring, but private code (names that start with a _) can have an abbreviated docstring.

ObsPlus makes extensive use of Python 3’s type hints. You are encouraged to annotate any public functions/methods with type hints.

Prefer pathlib.Path to strings when working with paths. However, when dealing with millions of files (e.g., ObsPlus’ indexers) strings may be preferred for efficiency.

Follow PEP8’s guidelines for imports.
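
To illustrate the pathlib guideline above, the sketch below lists the files under a directory in two ways: once with pathlib.Path objects (the default choice) and once with plain strings via os.walk, which avoids creating one Path object per file. Both function names are hypothetical and this is not how ObsPlus' indexers are implemented.

```python
import os
import tempfile
from pathlib import Path
from typing import List


def list_files_pathlib(directory: Path) -> List[Path]:
    """Return all files under directory as Path objects."""
    return sorted(p for p in directory.rglob("*") if p.is_file())


def list_files_str(directory: str) -> List[str]:
    """Return all files under directory as plain strings."""
    out = []
    for root, _, files in os.walk(directory):
        out.extend(os.path.join(root, name) for name in files)
    return sorted(out)


# Build a tiny directory tree and list it both ways.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "sub").mkdir()
    (base / "a.txt").write_text("a")
    (base / "sub" / "b.txt").write_text("b")
    as_paths = list_files_pathlib(base)
    as_strings = list_files_str(str(base))

print(len(as_paths), len(as_strings))  # 2 2
```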

Example functions

from typing import Optional, Union

import pandas as pd


# example public function
def example_func(
    df: pd.DataFrame, to_add: Optional[Union[int, float]] = None
) -> pd.DataFrame:
    """
    A simple, one line explanation of what this function does.

    Additional details which might be useful, and are not limited to one line.
    In fact, they might span several lines, especially if the author of the
    docstring tends to include more details than needed.

    Parameters
    ----------
    df
        A description of this parameter
    to_add
        A description of this parameter

    Returns
    -------
    If needed, more information about what this function returns. You shouldn't
    simply specify the type here since that is already given by the type annotation.

    Examples
    --------
    >>> # Examples are included in the doctest style
    >>> import numpy as np
    >>> import pandas as pd
    >>>
    >>> df = pd.DataFrame(np.random.rand(10))
    >>> out = example_func(df, to_add=1)
    """
    out = df.copy()
    if to_add is not None:
        out = out + to_add
    return out


# example private function
def _recombobulate(df, arg1, arg2):
    """
    A private function can have a simple (multi-line) snippet and doesn't need as
    much detail or type hinting as a public function.
    """

Working with dataframes

Column names should be snake_cased whenever possible.

Always access columns with getitem and not getattr (i.e., use df['column_name'], not df.column_name).

Prefer creating a new DataFrame/Series to modifying an existing one in place. In-place modification should require opting in (usually through an inplace keyword argument).
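
The three dataframe guidelines above can be sketched together as follows. The helpers to_snake_case and add_offset, and the column names, are hypothetical examples and not part of ObsPlus.

```python
import re
from typing import Optional

import pandas as pd


def to_snake_case(name: str) -> str:
    """Convert a CamelCase or space-separated column name to snake_case."""
    # Put an underscore between a lowercase letter or digit and a capital.
    name = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", name.strip())
    # Collapse runs of whitespace or hyphens into a single underscore.
    return re.sub(r"[\s\-]+", "_", name).lower()


def add_offset(
    df: pd.DataFrame, column: str, offset: float, inplace: bool = False
) -> Optional[pd.DataFrame]:
    """Add offset to a column; df is only modified when inplace=True."""
    target = df if inplace else df.copy()
    # Columns are accessed with getitem, never as attributes.
    target[column] = target[column] + offset
    return None if inplace else target


raw = pd.DataFrame({"EventId": [1, 2], "Source Depth": [100.0, 250.0]})
df = raw.rename(columns=to_snake_case)  # rename returns a new DataFrame
print(list(df.columns))  # ['event_id', 'source_depth']

shifted = add_offset(df, "source_depth", 10.0)
print(df["source_depth"].tolist())  # [100.0, 250.0], the input is untouched
print(shifted["source_depth"].tolist())  # [110.0, 260.0]

add_offset(df, "source_depth", 10.0, inplace=True)  # explicit opt-in
print(df["source_depth"].tolist())  # [110.0, 260.0]
```

Item access works for any column name, whereas attribute access fails for names containing spaces and silently returns a method instead of the column when the name collides with one (for example, a column called count).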