Good Scientific Software Practices, through Solitons

Introduction

Software engineering is now an integral part of science. As mathematicians and physicists, we’re pretty good at understanding algorithms, and even implementing them. But to ensure our code is easy to use, sharable, reproducible, tested and reliable we need to do a lot more than implement an algorithm. Modern software development involves continuous integration, linters, typers, LSPs, code coverage, automatically-generated docs, and more!

This course will focus on these “soft skills” of software. As such, we’ll be spending a lot of time setting up our project and development environment and thinking about our workflow. Then once we get to coding itself, I’m not really going to give you much advice or help you to understand the code itself. Hopefully it’s fairly self-explanatory and you can use your favorite search engine or LLM to understand the code. Try sharing a piece of code from the course with an LLM and ask some questions about it. Or ask it to improve the code.

These notes are designed to be self contained. You can start the course RIGHT NOW. You can suggest edits and raise issues on the GitHub page.

0.1 Plan for the course in Krakow

We have four sessions. In each one, I’ll give a brief introduction to the topic, then you will follow the notes and try and get some code working. If you’d like to spend this time simply working on your existing code: go for it!

By the end of the course, we’ll have made a small package to create a one-dimensional soliton and apply a flow to it. This is not scientifically revolutionary, but the code we make will have many good practices built into it. It will have tests and documentation. It’ll be readable, reproducible and published openly on GitHub. With this structure, I hope you can use this project as a base for future work.

So that everyone has the same experience, we’ll be using Python. If you’d prefer to use a different language, go for it! You’ll have to translate the Python code/concepts here into your favorite language, but that shouldn’t be too hard. Julia, C++ and Matlab are all reasonable choices. We will use Python because it’s quite easy to get started with and it’s the most popular (and employable) language in the world. Python is traditionally thought to be slow, and it can be. But if you know how it works it can be just as fast as any other language. I also recommend writing the code in VSCode or PyCharm.

0.2 Detailed Plan

  1. Before the course:
  • Installation
  1. Monday
  • Make a Project
  • Git and GitHub
  1. Tuesday
  • Source and scripts
  • Functions, classes and methods <– coding only begins here!!
  • Lots of code!
  1. Thursday
  • Compute derivatives and the energy
  • Plot a soliton
  • Save/Load a soliton
  1. Friday
  • Gradient flow
  1. Bonus
  • Tests
  • Documentation
  • Optimization
  • Linters, formatters and LSPs
  • Continuous Integration
  • Recommended Reading

1 Installation

1.1 The terminal

The terminal is a way to interact with your computer using text commands. If you’re using Mac or Linux you have a terminal installed already. It’s called Terminal. If you have Windows, you have Powershell. So that everyone has a similar experience, I encourage you to install Windows Terminal (there are installation instructions on its GitHub page)

Open your terminal and type “ls” (short for “list”) then press enter. This will display all the files in your currently directory. If you want to change directories type “cd” followed by the name of the directroy you want to go to.

1.2 uv (to install python + packages)

Python’s biggest asset (and liability) is that people write packages for it. There are over 600,000 packages available on PyPi. These packages usually depend on each other, and it can get overwhelming keeping track. As such, we’ll use a installation/package manager called uv. This will install python and the packages for us, and help initialise our project. To install uv follow the instructions on uv’s website. Check it works by opening a Terminal (on Mac or Linux) and typing uv.

Once you have uv, you can run python in the terminal by typing uv run python. The first time you do this uv might install Python. So it might take a little minute.

If uv doesn’t work

If uv doesn’t work, you’ll need to manage your virtual environments packages yourself. We’ll do this using Python’s venv (for (v)itrual (env)ironment). The venv is a little safe bubble where we’ll just install our current project and its dependencies (more). We first need to create a new venv. This will create a folder wherever you are in the Terminal, which will be where your venv is stored. I usually keep them in ~/.venvs (~ means “home”)

python -m venv path/to/where/you/keep/your/venvs/sig_numerics

To use it, we must activate it

source path/to/where/you/keep/your/venvs/sig_numerics/bin/activate

Hopefully the left hand side of your Terminal prompt now says your venv’s name. It is now activate. You can add python packages to your venv by installing them using pip (short for “(p)ip (i)nstall (p)ackages”). E.g. to add numpy to your venv: first, make sure your venv is activated, then run

pip install numpy

You can check which packages are installed using pip list.

In the course, whenever you read uv add {package_name} please use pip install {package_name} instead.

1.3 An editor

A good text editor really helps coding. A good one allows you to search through codebases, auto-complete code, and more. I recommend VSCode because it’s free and has really good Python support. Some people love PyCharm. Please install one.

Both VScode and PyCharm have lots of extensions. For VSCode, please install the Python extension.

1.4 Git

Check that you have git installed by typing ‘git’ then pressing enter in your Terminal. It’s hopefully already installed. If not, please install it This is a good moment to make a GitHub account too.

2 Make a project

We’re going to make a Python Package. First choose a name. I’ll be using solitons1d. Find a place you want to store your project and go to that directory in Terminal. Initialise a new python package by typing e.g. uv init --package solitons1d in Terminal and pressing enter. Go have a look in the folder (either using your File Explorer or typing ls into Terminal). There should be four files inside the folder. uv has made these:

  • README.md This file is what a new user will first read when they encounter your project. It should contain a description of the project, installation instructions and anything else you’d like. It’s written in markdown, which isn’t too far from plain text (https://www.markdownguide.org/).
  • pyproject.toml This file contains all the information about your project. Its name, version, description etc. Add some of your own details if you’d like. This file will also keep track of which other packages your package depends on once we add some.
  • uv.lock Keeps the detailed information about package dependencies.
  • src/solitons1d. A folder which will contain your package.

(if you are not using uv, please make these files.)

Using PyCharm?

We need to tell PyCharm which python interpreter we want to use, so that it’s aware of our solitons1d package. Open PyCharm and choose ‘Open’ and open the soliton1d folder. PyCharm should figure out what’s going on. However, you might need to tell it which interpreter you want to use. To do this, you should use the one that uv has created. This is located at solitons1d/.venv. If you’re unsure - ask me!

Ok! Let’s open the project in VSCode. The easiest way to do this is in terminal. We’re going to do something slightly fancy that will make sense later, and open it with uv. Do this by navigating to the project folder in Terminal then typing uv run code .. The dot at the end means “this folder here”. code is the shortcut for VSCode and uv run means “When you open VSCode, make it aware of which python environment is being used at the moment”. More on environments soon!

uv run code . didn’t work

We need to tell your terminal that “code” means “open VSCode”. We only need to do this once. Follow the instructions here. They are: open VSCode like a normal application. Open the Command Pallet by pressing Cmd+Shift+P (mac) or ctrl+shift+P (windows). Type ‘shell command’, and select the “Shell Command: Install ‘code’ command in PATH command.” Then restart your terminal. Now try running uv run code . again.

VSCode should have opened, and you should see the project directory at the left hand side. There should also be a Terminal at the bottom. If there isn’t you can put it there using View -> Terminal. From the terminal we can run python by entering uv run python, then import our package using import solitons1d. Quit python by running exit(). Note: we won’t use python in the terminal much.

If that didn’t error - amazing!! You’ve just set up your developer workflow. I think doing this is actually one of the hardest parts of coding. I’ve shown you my workflow which is fairly standard; but everyone has their own. You can read thousands of articles and YouTube videos (just search neovim for python on YouTube to really start going down the rabbit hole).

2.1 Virtual Enviroments

I’ve been sweeping the following topic under the rug but it’s very important to understand: virtual environments. Because python has so many packages and their dependencies are complex, it’s considered good behaviour to make a different environment for different projects. So you might have one environment for when you do statistics, one for when you do heavy numerical stuff and another for plotting. uv takes this to the extreme: it makes a new virtual environment every time you run python. When we ran uv run code . it make a folder called .venv containing the information about the environment. Then it opened VSCode and told it that we’re using this environment. This means VSCode is aware of which packages we’ve installed. This is gonna be super helpful later.

Now we’ll add some packages to our project - this tells uv that when we create the virtual environment it should install these packages. In the terminal, navigate to your project folder. Add the package numpy by running uv add numpy. This will add numpy to the project by adding it to the pyproject.toml file. While we’re at it, please also install ipykernel by running uv add ipykernel.

3 Git and GitHub

Git is version control software. It keeps track of changes you make in the code, so you can easily revert to previous versions. It’s really helpful when you work on a big project with several contributors. When a bug gets introduced you can usually immediately figure out who caused it!

GitHub is run by Microsoft. It’s where you can publish your software. It uses Git to keep track of changes etc.

The idea here is to have one copy of your project on your local computer, and one copy of your project on GitHub. We’ll then link the two copies together, so changes you make locally can get applied to the remote (GitHub) copy. Then GitHub takes care of a lot of nice things, which we’ll get on to later.

There are lots of ways to link your local and remote repos. For more info, try: https://github.com/git-guides/git-init .

It can get a bit messy at the start here, so please follow these instructions carefully:

We’ll make a GitHub repository (repo) and link it to this folder. Go to github.com, make an account then create a new repository. DON’T DO ANYTHING except add the repository name, then press “Create repository”. Towards the bottom of the next page you should see the heading “…or push an existing repository from the command line”, with some commands to run. Something like:

git remote add origin https://github.com/chrishalcrow/solitons1D.git
git branch -M main
git push -u origin main

Run these three commands from your local repo folder in Terminal. They say:

  • “Link the git in this folder to this github repository”
  • “Create the primary ‘branch’ of the repo, called main”
  • “Push (put, or add) everything from the folder to the repo on github.com”

If that worked: great! You’re project is now online! If the last one error’d, that’s ok. We’ll get that sorted soon.

Note that GitHub is useful for any project work you’re doing. This website is hosted on GitHub. Even better: GitHub builds this website using its compute power. And it’s free - very cool!

4 Sources and scripts

As we build our package, we should keep a separation between the “source” and “scripts”. The source is our package itself. This will contain the logic which can do things and that you want to share with other people. Functions like compute_energy or make_soliton. But when we use the package to do science, we’d write a script. The script will import functions from the source and apply them to a specific case. A script might make a soliton, apply a gradient flow, compute the energy, then plot the result and save the figure as a pdf.

To make this distinction obvious, we’ll keep two folders in the project folder. One called “src” that uv already created for us (short for source - this shortening is conventional) and one called “scripts”. In the “src” there should already be another folder called the name of your package. Inside this there is a file called __init__.py which tells Python that this folder contains a “module”. Let’s create another files in the src/{project_name} folder called soliton.py, which will contain the code about the soliton, Also add a script file called play.ipynb in the scripts folder. Overall, my project folder looks like this:

solitons1D/
    pyproject.toml
    README.md
    scripts/
        play.ipynb
    src/
        solitons1D/
            __init__.py
            soliton.py
    uv.lock

To test that everything has worked. Open play.ipynb and enter import solitons1d. You might need to tell the notebook which python interpreter (vritual enviroment) to use. It should suggest using the one that uv has created, which is located at solitons1d/.venv.

5 Functions, classes and methods

Ok - let’s start coding!!

5.1 Functions

Let’s make a function. Functions take some input and return some output. Let’s make a create_profile function which will generate a profile for a 1D soliton. For now, we’ll just return an array of 0s:

import numpy as np

def create_profile(num_grid_points):
    profile = np.zeros(num_grid_points)
    return profile

I’ll write this out in English:

  • “Import the package called numpy and call it np, which is how we’ll be able to access it.”
  • “Define a function called create_profile, which will accept the argument num_grid_points.”
  • “Use the function zeros from the numpy library. (This will create an array of zeros). Assign this array of zeros to the variable profile.”
  • “Return the variable profile. This return signals the end of the function.

There are several improvements we can make to code immediately, we can: add type information about the arguments and returned parameters; add a docstring describing the function.

import numpy as np

def create_profile(num_grid_points: int) -> np.array:
    """
    Creates a profile function for a grid with 
    `num_grid_points` points.

    Parameters
    ----------
    num_grid_points: int
        Number of grid points

    Returns
    -------
    profile: np.array
        Generated profile function 
    """

    profile = np.zeros(num_grid_points)
    return profile

(notes on choosing variable/function names: https://inventwithpython.com/beyond/chapter4.html)

Let’s use this function. We’re going to use play.ipynb. This is a Jupyter notebook. To use it, we need to install another package to our environment. In the terminal, navigate to the project folder, type “uv add ipykernel” then press enter. Now open play.ipynb in VSCode.

A Jupyter notebook allows you to run little snippets of code in a “cell”. We need to import our function from our package, then run it. Because we are developing and editing our package, we need to tell Jupyter to listen out for changes to the package. We can do do this by running the following “magic” commands before anything else. Put these in a cell and run the cell (by pressing ctrl+enter or shift+enter or pressing the play button at the side of the cell):

%load_ext autoreload
%autoreload 2

Now write this code in another cell

from solitons1d.soliton import create_profile
profile = create_profile(100)
profile

and execute this (either by pressing the “play” button next to the cell or pressing shift+enter). The final line of the cell gets displayed underneath it. So we should see a large array of 0s.

5.2 Classes and methods

Functions are fine. But classes are the heart of object-orientated programming. Think of a class as an abstract definition like a Group (from Group Theory). Then we can create objects which conform to the class definition (like we can create \(D_4\), a specific example of a group). In src/soliton.py, let’s make a class which will represent a soliton. When you make a specific instance of a class (object), it is “initialised”. During initialisation, we can choose what information to store in the object.

class Soliton():

    def __init__(self, num_grid_points, grid_spacing):
         self.num_grid_points = num_grid_points
         self.grid_spacing = grid_spacing
         self.profile = create_profile(num_grid_points)

In English:

  • “Define a class called Soliton”
  • “Define the initialisation method, which will accept two numbers, the number of grid points and the grid spacing.”
  • “store the number of grid points”
  • “store the grid spacing”
  • “Make a profile function using the create_profile function and store it in profile.

Now, back in the play.ipynb notebook, we can make a Soliton object:

from solitons1d.soliton import Soliton
my_soliton = Soliton(100,0.1)
print(my_soliton.num_grid_points)
  • “from the solitons1d package, import the class Soliton”
  • “use the Soliton class to initialise a soliton with 100 grid points and 0.1 grid spacing called my_soliton
  • “take the soliton I just made, and get the number of grid points. Print this.”

Great! Hopefully you can see that it’s helpful to have an object that contains some information.

5.3 Methods

Even more helpful: write class-specific functions that can change what’s in the object. These are called “methods”. They’re similar to functions but are designed to only be applied to the class. These are defined in the class definition itself. Let’s create a method which will (eventually) compute the energy of the soliton. The Soliton class gets updated to the following:

class Soliton():

    def __init__(self, num_grid_points, grid_spacing):
         self.num_grid_points = num_grid_points
         self.grid_spacing = grid_spacing
         self.profile = create_profile(num_grid_points)

    def compute_energy(self):
         total_energy = np.sum(self.profile)
         total_energy *= self.grid_spacing
         self.energy = total_energy

Now we have a method that takes in the profile function, sums it up, multiplies this value by the grid spacing, stores the value in self.energy then returns this value. We can use it in a script as follows:

my_soliton.compute_energy()
print(my_soliton.energy)

We’ve now got the basic building block that we’ll use for the rest of the course.

5.4 Upload to GitHub

Now that we have a few functions, methods, and classes: let’s sync our local folder with our GitHub repository. You can do this using the Terminal (), but VSCode has some great Git integration. So let’s use that. Find the “Source Control” button in VSCode. When you press it, it should show a list of files that you have made changes to. The logic is:

  1. “Stage” the files you want to sync to GitHub. Do this by selecting the files you want to stage, right-clicking, then choosing “Stage Files”.
  2. Commit and Push the staged files to GitHub. Do this by first writing a message in the Message box. Maybe “Added Soliton class”. Then press the down arrow at the right hand side of the commit button, and click “Commit and Push”.

The first time you do this, you might need to Publish your branch. Every other time, it’s just the 2-step process above.

Now go to your github page (mine is github.com/chrishalcrow/solitons1d/) and see if your changes have appeared there.

6 Lots of code!

Let’s get starting coding! I’ll explain the overall structure of the codebase, then we’ll go through the actual code very quickly. In each subsection, I’ll get an overview then present the code.

Our centerpiece is the Soliton class. The Soliton will contain a Grid object and a Lagrangian object, keeping the numerical grid maintained and the mathematical details of the theory respectively. By doing this, each Soliton will know what theory is belongs to - handy. A Soliton object will contain methods allowing one to create an initial profile function, compute the energy and gradient flow.

Note: Another programmer might make this differently. Maybe having a Grid class is overkill. Maybe allowing the user to specify a generic Lagrangian is silly - you might just hardcode the Lagrangian you’d like in your Soliton object. Play around, debate, and figure out your style.

6.1 The Grid class

A user can create a grid by specifying the number of grid points and the grid spacing. The Grid will then compute a grid (saved as numpy array) and save the length of the grid.

Add this to soliton.py:

import numpy as np

class Grid:
    """
    A 1D grid.

    Parameters
    ----------
    num_grid_points : int
        Number of grid points used in grid.
    grid_spacing : float
        Spacing between grid points.

    Attributes
    ----------
    grid_length : float
        Total length of grid.
    grid_points : np.array[float]
        An array of grid points, of the grid.

    """

    def __init__(
        self,
        num_grid_points: int,
        grid_spacing: float,
    ):
        self.num_grid_points = num_grid_points
        self.grid_spacing = grid_spacing
        self.grid_length = (num_grid_points) * grid_spacing

        self.grid_points = np.arange(
            -self.grid_length / 2, self.grid_length / 2, grid_spacing
        )

Possible improvements:

  • Allow the user to specify the grid in other ways: by providing the start and end points, for example. Question: How do you make the code flexible to allow the user to do several different things. Hint: If num_grid_points is None:

Question:

  • What’s the difference between a Parameter and an Attribute?

Use. Make a grid as follow in a script:

from solitons1d.soliton import Grid
my_grid = Grid(200,0.1)

6.2 Make a Lagrangian class

The Lagrangian will keep track of the potential function. We won’t set up automatic differentiation, so we’ll also give the Lagrangian the derivative of the potential. We’ll also optionally allow the user to input the vacua of the theory. If they do, we’ll add a check that the derivative function of the vacua return 0. This class could also be where we define how many fields our theory has, any funny metric, and more. But let’s keep it simple for now.

class Lagrangian:
    """
    Used to represent Lagrangians of the form:
        L = - 1/2(dx_phi)^2 - V(phi)

    Parameters
    ----------
    V : function
        The potential energy function, must be a map from R -> R
    dV : function
        The derivative of the potential energy function, must be a map from R -> R
    vacua : list-like or None
        List of vacua of the potential energy.
    """

    def __init__(
        self,
        V: Callable[[float], float], # Yup - you can pass functions are argument in python!
        dV: Callable[[float], float],
        vacua: list | np.ndarray | None = None,  # np.ndarray is the type of a numpy array
    ):
        self.V = V
        self.dV = dV
        self.vacua = vacua

        if vacua is not None:
            for vacuum in vacua:
                # np.isclose does what it sounds like: are the values close?
                # That f"" is called an f-string, allowing you to add parameters to strings
                assert np.isclose(dV(vacuum), 0), (
                    f"The given vacua do not satisfy dV({vacuum}) = 0"
                )

Use:

from solitons1d.soliton import Lagrangian
import numpy as np

def phi4_V(x):
    return 0.5*np.pow(1-np.pow(x,2),2)

def phi4_dV(x):
    return 2*np.pow(x,3) - 2*x

phi4_lagrangian = Lagrangian(V=phi4_V, dV=phi4_dV, vacua=[-1,1])

Possible improvements:

  • Add an automatic differentiation package to compute dV from V.
  • Allow for multiple fields, different metric terms, time (harder - do you need to make a TimeGrid??).
  • Delete this and make it part of the Soliton class?? Is this an improvement, or not? Why?
  • Add library of common Lagrangians, so that the user doesn’t have to specify the potential every time.
  • Add a plotting function which plots the potential and its derivative.

Questions:

  • Why put an assert where we did? Better as a test?
  • Can you edit the example code so that the assert is triggered?

6.3 Update the Soliton class

Time to update the Soliton class to use the Grid and Lagrangian classes. When initialising the soliton, we’ll allow the user to specify an initial profile function: either by specifying a function or passing an array representing the initial profile. We’ll add a compute_energy method, and do something a bit odd: create another function called compute_energy_fast. Roughly: the method in the class strips out what we need from the object and passes this to compute_energy_fast: this second function doesn’t interact with the class at all and we can optimize the heck out of it later. This is THE KEY concept to get the benefits of: 1) Lovely user-friendly classes 2) FAST STUFF. For now, we won’t implement the energy function, but we will set up the machinery to do it later.


class Soliton:
    """
    A class describing a Soliton.

    Parameters
    ----------
    grid : Grid
        The grid underpinning the soliton.
    lagrangian : Lagrangian
        The Lagrangian of the theory supporting the soliton.
    initial_profile_function : None | function
        The initial profile function, must be from R -> R. Optional.
    initial_profile : None | array-like
        The initial profile function as an array. Optional.
    """

    def __init__(
        self,
        grid: Grid,
        lagrangian: Lagrangian,
        initial_profile_function: Callable[[float], float] | None = None,
        initial_profile: np.ndarray | None = None,
    ):
        self.grid = grid
        self.lagrangian = lagrangian

        self.profile = np.zeros(grid.num_grid_points)

        assert (initial_profile_function is None) or (initial_profile is None), (
            "Please only specify `initial_profile_function` or `profile_function`"
        )

        if initial_profile_function is not None:
            self.profile = create_profile(self.grid.grid_points, initial_profile_function)
        else:
            self.profile = initial_profile

        self.energy = self.compute_energy()

    def compute_energy(self):
        """Computes the energy of a soliton, and stores this in `Soliton.energy`."""

        energy = compute_energy_fast(
            self.lagrangian.V,
            self.profile,
            self.grid.num_grid_points,
            self.grid.grid_spacing,
        )
        self.energy = energy

def compute_energy_fast(V, profile, num_grid_points, grid_spacing):

    total_energy = 0
    return total_energy


def create_profile(
    grid_points: np.array,
    initial_profile_function: Callable[[np.array], np.array] | None = None,
) -> np.array:
    """
    Creates a profile function on a grid, from profile function `initial_profile_function`.

    Parameters
    ----------
    grid_points: Grid
        The x-values of a grid.
    initial_profile_function: function
        A function which accepts and returns a 1D numpy array

    Returns
    -------
    profile: np.array
        Generated profile function
    """

    profile = initial_profile_function(grid_points)
    return profile

Example code:

from solitons1d.soliton import Lagrangian, Grid, Soliton
import numpy as np

def phi4_V(x):
    return 0.5*np.pow(1-np.pow(x,2),2)

def phi4_dV(x):
    return 2*np.pow(x,3) - 2*x

phi4_lagrangian = Lagrangian(V=phi4_V, dV=phi4_dV, vacua=[-1,1])
my_grid = Grid(20,0.1)

my_soliton = Soliton(my_grid, phi4_lagrangian, initial_profile_function=np.tanh)

print(my_soliton.profile)

Improvements:

  • If a user passes an initial_profile, check it’s the right length.

Questions:

  • What are all these selfs about??
  • Can you trigger the assert by playing with the example code?
  • Before going on to the next section: try to write the compute_energy_fast function, or part of it.
  • We’ve edited the create_profile function to fit in with our class. How has it changed? Why?

You’ve now seen quite a lot of Python code. This might be a good time to read PEP8, the Zen of Python, and some tips about writing “Pythonic” code (https://inventwithpython.com/beyond/chapter6.html).

7 Formatters and linters (Optional)

Programmers are opinionated. Especially when it comes to formatting. Here’s a question: which looks better:

a = 1 + 1

or

a=1+1

Different people will have different answers. And if you have a project with many contributors you ideally want to consistent answers. Usually, a codebase defers these decisions to A formatter. We’ll use ruff. This can be installed using uv by running uv add tool ruff.

Ruff basically is a giant list of rules. (You can edit the rules you use by editing the ruff section of your pyproject.toml file: (more info)[https://docs.astral.sh/ruff/configuration/]). You can check if your code passes all these rules by running ruff check in solitons1d. Then it can automatically fix certain rule breaks if you run ruff format. Try it!

This works ok, but most people use ruff more interactively by installing the ruff extension in VSCode. Then ruffs rule checks will appear automatically as you type your code. Try installing it and see what happens.

ruff is also a linter. These are static code checkers which try to find simple bugs before you run anything. Once you’ve installed the ruff extension, try adding b = a + 1 somewhere. A red line appears under my variables, telling me that a is Undefined and that b is never used. Nice.

8 Derivatives and Energy

To compute the energy, and the gradient flow, we’ll need to compute derivatives. There are many ways to do this. The most common in the soliton field is using a finite difference method. Here, we implement the approximation

\[ \phi'(x) = \lim_{\epsilon \to 0}\frac{\phi(x+\epsilon) - \phi(x-\epsilon)}{2\epsilon} \]

on our rigid grid. This can be approximated in many ways. You get more accuracy if you use more points. But if your true function does change fast and your points are too widely spread you’ll hit numerical blow-ups. For solitons, I usually use a fourth-accuracy method. If we use i to denote the lattice point, the fourth order derivative approximates the derivative on the grid as follows:

\[ (\partial \phi)_i = \frac{1}{12}\phi_{i-2} - \frac{2}{3}\phi_{i-1} + \frac{2}{3}\phi_{i+1} - \frac{1}{12}\phi_{i+2} \]

You can find other accuracies on Wikipedia.

The big problem is the boundary. When we try to compute the derivative at the 0th point, it tries to access the -1th point, which doesn’t exist. Here are a few solutions:

  • Use Dirichlet boundary conditions. So keep the boundary fixed. Since you’re keeping it fixed, you don’t need to update it and hence you don’t need to compute the derivative there. Easy!
  • Use forward/backwards finite difference to approximate the boundary points.
  • Use Neumann boundary conditions. Keep the derivative equal to zero. If you do this, then at the boundary you can set \(\phi_{-1} = \phi_{1}\) and \(\phi_{-2} = \phi_2\), and you don’t need to worry.
  • Use (possibly shifted) periodic boundary conditions, so that \(\phi_{-1} = \phi_{N}\) and \(\phi_{-2} = \phi_{N-1}\).

We’re going to do the simplest: Dirichlet. A good exercise: implement something else.

So, we’ll write functions to compute the first and (while we’re at it) second derivatives of a function. (This is going to be a FAST function, so we don’t want it to interact with the class):

def get_first_derivative(
    phi: np.ndarray, 
    num_grid_points: int, 
    grid_spacing: float,
) -> np.ndarray:
    """
    For a given array, computes the first derivative of that array.

    Parameters
    ----------
    phi: np.ndarray
        Array to get the first derivative of
    num_grid_points: int
        Length of the array
    grid_spacing: float
        Grid spacing of underlying grid

    Returns
    -------
    d_phi: np.ndarray
        The first derivative of `phi`.

    """
    d_phi = np.zeros(num_grid_points)
    for i in np.arange(num_grid_points)[2:-2]:
        d_phi[i] = (phi[i - 2] - 8 * phi[i - 1] + 8 * phi[i + 1] - phi[i + 2]) / (
            12.0 * grid_spacing
        )

    return d_phi


def get_second_derivative(
    phi: np.ndarray, 
    num_grid_points: int, 
    grid_spacing: float,
) -> np.ndarray:
    """
    For a given array, computes the first derivative of that array.

    Parameters
    ----------
    phi: np.ndarray
        Array to get the first derivative of
    num_grid_points: int
        Length of the array
    grid_spacing: float
        Grid spacing of underlying grid

    Returns
    -------
    d_phi: np.ndarray
        The first derivative of `phi`.

    """
    ddV = np.zeros(num_grid_points)
    for i in np.arange(num_grid_points)[2:-2]:
        ddV[i] = (
            -phi[i - 2] + 16 * phi[i - 1] - 30 * phi[i] + 16 * phi[i + 1] - phi[i + 2]
        ) / (12.0 * np.pow(grid_spacing, 2))

    return ddV

With these derivative functions, computing the energy is easy. We just estimate the integral as a sum,

\[ E = \int \mathcal{E}(x) dx \approx (\Delta x)\sum_i \mathcal{E}_i \, , \]

like so:

def compute_energy_fast(
    V: Callable[[float], float],
    profile: np.array, 
    num_grid_points: int, 
    grid_spacing: float,
) -> float:
    """
    Computes the energy of a Lagrangian of the form
        E = 1/2 (d_phi)^2 + V(phi)

    Parameters
    ----------
    V: function
        The potential energy function
    profile: np.ndarray
        The profile function of the soliton
    num_grid_points: int
        Length of `profile`
    grid_spacing: float
        Grid spacing of underlying grid
    """
    dx_profile = get_first_derivative(profile, num_grid_points, grid_spacing)

    kin_eng = 0.5 * np.pow(dx_profile, 2)
    pot_eng = V(profile)

    tot_eng = np.sum(kin_eng + pot_eng) * grid_spacing

    return tot_eng

Example code, following on from last time

my_soliton.compute_energy()
print(my_soltion.energy)

Improvements:

  • Allow for any type of boundary conditions
  • Allow for any kind of kinetic energy

Questions: - You could get num_grid_points by computing len(profile). How would this improve your code? How would this make your code worse?

9 Plot a Soliton

We’re now making and computing solitons - let’s take a look at what we’re plotting. The most popular plotting package in Python is matplotlib. We need to install it by navigating to our Project folder in Terminal and running uv add matplotlib. In matplotlib, you’ve got a Figure which contains Axes. You can plot stuff on an axis, and adjust ticks and titles. Our code will look like:

import matplotlib.pyplot as plt
fig, ax = plt.subplots()
ax.plot(x,y)
ax.set_title("hello!")

And it makes sense and WE LOVE IT. When you Google matplotlib you’ll find lots of code that looks like plt.plot(x,y); plt.show(). WE HATE THIS!!!! (ask me about it later)

So, the plan is to make a plotting method of the Soliton class that will return a matplotlib figure. It’s nice to play with plotting in a Jupyter notebook. Here’s my attempt at a reasonable plotting method. This is a method of the class, so should be indented in the class:

    def plot_soliton(self):
        """Makes a plot of the profile function of your soliton"""

        fig, ax = plt.subplots()
        ax.plot(self.grid.grid_points, self.profile)
        ax.set_title(f"Profile function. Energy = {self.energy:.4f}")

        return fig

Code example, following on from earlier:

# if you run just this first line in a Jupyter cell, it should display a figure
fig = my_soliton.plot_soliton()
fig.savefig("my_lovely_soliton.pdf")

10 Save/load a Soliton

When saving any numerical objects, you should save enough data so that it can be reconstructed. For our Soliton class this is actually quite complex because we need to save the Lagrangian and the Grid too; and the Lagrangian includes generic functions. When the object is this complex we should probably save it as a folder. The folder should contain enough information to reconstruct the object. Ideally, any data which can be should be in a “human readable” format like json or yaml.

We’ll go a bit further and make a save/load function for the Grid and Lagrangian objects too. These will also be contained in a folder. Overall the Soliton export will look like

my_soliton/
    profile.npz
    metadata.json
    grid/
        metadata.json
    lagrangian
        metadata.json
        potential.pkl

This might be a bit of an overkill for this project.

10.1 Save/Load a Grid

The Grid is an easy object to save and load because it can be reconstructed from just the number of lattice points and the lattice spacing. We first add a save method to the Grid class. This will involve some new “standard” packages: json and pathlib. The standard packages are packages with basic Python, so you don’t need to add them using uv.

# add near the top of the file
import json
from pathlib import Path

# add to Grid class
    def save(
        self,
        folder_name: str | Path,
    ):
        """
        Saves a `Grid` object at `folder_name`.
        """
        metadata = {
            "num_grid_points": self.num_grid_points,
            "grid_spacing": self.grid_spacing
        }

        # make the folder a Path if it is a string
        folder = Path(folder_name)
        folder.mkdir(exist_ok = True)

        # this overwrites any existing metadata.json file
        with open(folder / "metadata.json", "w") as f:
            json.dump(metadata, f)

The loading function is a function, not a method (why?). It should read the json file, make a Grid object and return it

def load_grid(folder_name: str | Path):
    """
    Loads the `Grid` object at `folder_name`.
    """
    folder = Path(folder_name)
    metadata_path = folder / "metadata.json"

    assert metadata_path.is_file(), f"Could not find Grid `metadata.json` file in {folder}."
    
    with open(metadata_path, "r") as f:
        grid_metadata = json.load(f)
    
    # the ** "unpacks" the dictionary into a series of arguments
    grid = Grid(**grid_metadata)
    return grid

Using it:

from solitons1d.soliton import load_grid

my_grid = Grid(100,0.1)
my_grid.save("my_lovely_grid")
my_loaded_grid = load_grid("my_lovely_grid")

10.2 Save/Load a Lagrangian

The Lagrangian is more complicated: we need a save the potential and the derivative of the potential, which are generic functions. We’ll do this using the pickle format, which can save any Python object (note: you could use this to save the full Soliton object).

import pickle as pkl

# add to Lagrangian class

    def save(
        self,
        folder_name: str | Path,
    ):
        """
        Saves a `Lagrangian` object at `folder_name`.
        """
        metadata = {
            "V": self.V,
            "dV": self.dV,
            "vacua": self.vacua,
        }

        # make the folder a Path if it is a string
        folder = Path(folder_name)
        folder.mkdir(exist_ok = True)

        with open(folder / "metadata.pkl", "wb") as f:
            pkl.dump(metadata, f)

# add to main body

def load_lagrangian(folder_name: str | Path):
    """
    Loads the `Lagrangian` object at `folder_name`.
    """
    folder = Path(folder_name)
    metadata_path = folder / "metadata.pkl"

    assert metadata_path.is_file(), f"Could not find Lagrangian `metadata.json` file in {folder}."
    
    with open(metadata_path, "rb") as f:
        lagrangian_metadata = pkl.load(f)
    
    # the ** "unpacks" the dictionary into a series of arguments
    lagrangian = Lagrangian(**lagrangian_metadata)
    return lagrangian

Questions:

  • Save a Lagrangian and a Grid, and take a look in the folder. Why might you prefer json to pickle?

10.3 Save/Load a Soliton

Now we want to save the Soliton, including any profile funciton we’ve made. Since we compute the energy on instantiation we’ll also save this as a property. Let’s go:

    def save(
        self,
        folder_name: str | Path,
    ):
        """
        Saves a `Soliton` object at `folder_name`.
        """

        folder = Path(folder_name)
        folder.mkdir(exist_ok = True)

        grid_folder =  "./grid"
        lagrangian_folder = "./lagrangian"

        metadata = {
            'grid_folder': str(grid_folder),
            'lagrangian_folder': str(lagrangian_folder),
        }

        properties = {
            'energy': self.energy
        }

        self.grid.save(grid_folder)
        self.lagrangian.save(lagrangian_folder)

        with open(folder / "metadata.json", "w") as f:
            json.dump(metadata, f)

        with open(folder / "properties.json", "w") as f:
            json.dump(properties, f)
        
        # use `numpy`s save function to save the profile array
        np.save("profile", self.profile)


def load_soliton(folder_name):
    """
    Loads the `Lagrangian` object at `folder_name`.
    """
    folder = Path(folder_name)
    metadata_path = folder / "metadata.json"

    assert metadata_path.is_file(), f"Could not find Grid `metadata.json` file in {folder}."
    
    with open(metadata_path, "r") as f:
        soliton_metadata = json.load(f)

    grid_folder = soliton_metadata.get("grid_folder")
    grid = load_grid(grid_folder)

    lagrangian_folder = soliton_metadata.get("lagrangian_folder")
    lagrangian = load_lagrangian(lagrangian_folder)

    profile = np.load("profile.npy")
    
    soliton = Soliton(grid = grid, lagrangian=lagrangian, initial_profile=profile)

    return soliton

Exercise: save a soliton, and load it.

Note: We’ve written more code to save and load our objects than we will to gradient flow them. Ouch.

11 Gradient flow

Exercise: Write a gradient flow function. This should implement the following mathematical evolution:

\[ \dot{\phi} = - \frac{\delta E}{\delta \phi} \]

\[ \dot{\phi} = \partial_x^2 \phi - \partial_\phi V[\phi] \]

The user should be able to use the method follows:

print(f"Energy before flow: {my_soliton.energy}")
my_soliton.gradient_flow(steps=1000, dt=0.0001)
my_soliton.compute_energy()
print(f"Energy after flow: {my_soliton.energy}")

You’ll actually write two functions: a gradient_flow method in the Soltion class and a gradient_flow_fast function in the body of the code.

End of “core” course

12 Bonus

12.1 Timing and Optimization

Let’s make our functions FAST.

We’ll do this using numba. Numba is a big-fry package: Nvidia use it for their python implemtation of CUDA (their GPU library). Roughly, when we “numbafy” a function, when the complier first sees the function it compiles it into highly-optimized C code. But it can only do this to certain functions. Roughly: it likes for loops and numpy functions. It hates strings, dictionaries and dataframes. More read here:

Before we make the code fast, let’s see how fast it is now. We’ll do the timing using the perf_counter from the time standard library. To really see a difference in 1D code we need to make a silly example. Let’s make a giant grid, then compute the derivative lots of times using get_first_derivative, and time it:

from time import perf_counter
from solitons1d.soliton import get_first_derivative
import numpy as np

array_length = 100_000
random_big_array = np.random.random(array_length)

times = []
for _ in range(100):
    t1 = perf_counter()
    get_first_derivative(random_big_array, num_grid_points=array_length, grid_spacing=0.1)
    t2 = perf_counter()
    times.append(t2-t1)

print(np.median(times))
>>> 0.04206724999676226

So computing the derivative of a 100_000 long array took 0.04206724999676226 seconds.

To use numba we need to add it to our package (uv add numba) then “decorate” our get_first_derivative function using njit by doing the following in soliton.py:

from numba import njit

@njit
def get_first_derivative(
    ...

njit means “no-python just-in-time compliation”. More: “when the compiler first meets this function, compile it just as you use it. Try to make pure C++ code. In fact, if you can’t do this and still have some python in it: throw an error!”.

Restart your jupyter kernel, and rerun the timing script. I now get a time of 0.00010070799908135086 seconds, a speed-up of \(\times 420\)!!!!!

Now, the reason this worked so easily is because we set up the code structure so that numba would be easy to implement. Doing this from scratch is kinda difficult. Luckily, all our code is numba-fy-able. We just need to decorate lots of funcitons with @njit. I’ve added it to

  • compute_energy_fast
  • grad_flow_fast
  • get_first_derivative
  • get_second_derivative
  • compute_dE

Note: you need to @njit the Lagrangian’s functions too when you make it.

The other functions and methods don’t need to be fast.

numba also allows for parallelisation if we tell if which loops are parallelisable. ALL of ours are!! We can tell it that they are parallelisable by using a prange (parallel range) in the for loop and by telling the @njit decorator that parallelisation is allowed. This takes a little bit of modifying… The get_first_derivative function now looks like:

@njit(parallel=True)
def get_first_derivative(
    phi: np.ndarray, 
    num_grid_points: int, 
    grid_spacing: float,
) -> np.ndarray:
    d_phi = np.zeros(num_grid_points)
    for i in prange(num_grid_points-4):
        d_phi[i+2] = (phi[i] - 8 * phi[i + 1] + 8 * phi[i + 3] - phi[i + 4]) / (
            12.0 * grid_spacing
        )
    return d_phi

I get a x2 speed-up on my laptop (but to see the benefit clearly, I need to incease the lattice points to 10_000_000). Feel free to parallelise all your other fast functions too!

Now that things are getting more complicated, it’s at least nice that the complicated stuff is contained in the get_first_derivative function. This mess isn’t leaking into the Soliton class code, or anything like that. It’s gross but we’ve localised it.

Some more optimisation tips:

  • Allocating memory is expensive. So d_phi = np.zeros(num_grid_points) costs time. We can save this time by preallocating this memory. Maybe we can keep a copy of d_profile and dd_profile in the Soliton class.
  • Initialising the multi-threading happens when the for loop is called, and it expensive. Currently we do this once during get_first_derivative and again during dV(profile[i]) (if we turn this into a loop). It would be faster to only do this once. To do this our gradient flow code should look like
def compute_dE(...):
    dE = np.zeros(points)
    for i in prange(points):
        dd_at_point = get_second_derivative_at_point(i)
        dV_array = dV(profile[i])
        dE[i] = dd_at_point - dV_array

This is harder than it sounds, because get_second_derivative_at_point depends on which point you’re looking at. At the boundary, you have to do something different. There are a few creative solutions to deal with this. Think about it, and feel free to talk to me about them.

  • Don’t worry about little things. Modern compilers are smart. If you write something like
my_variable = 3
return my_variable

This is a bit stupid, because you assign a variable, then return it. It would be quicker to just return 3. Luckily, the compiler will say “awww, the human has set a variable equal to something then returned it on the next line. That’s so inefficient, I’ll just ignore that”. This happens all the time. Focus your efforts on readability and learning numba. ignore little smart things: the compiler will deal with these.

12.2 Refactoring

Refactoring is the practice of taking your already-written code and making it “better”. Better is hard to define: usually you want to organise it into a more logical structure, or reduce code-duplication as much as possible. I’m getting overwhelmed by the soliton.py file so I’m going to refactor so that the Grid and Lagrangian objects get their own files. Basically, we just move the Grid class and the load_grid function into a new file called grid.py and similar for the Lagrangian.

..

..

..

Ok, I did that.

If you run some code - it’ll break! This is because your imports currently assume that all your functions and classes are in solitons.py. And e.g. the Grid class relies on numpy which hasn’t been imported in that function.

The top’s of each file need updating. For grid.py, we should have

import json
from pathlib import Path
import numpy as np

for lagrangian.py we have:

import pickle as pkl
from pathlib import Path
from typing import Callable
import numpy as np

and soliton.py should contain:

import json
from typing import Callable
from pathlib import Path

import matplotlib.pyplot as plt
import numpy as np

# this means, from the grid file, import Grid
from .grid import Grid
from .lagrangian import Lagrangian

(Some tips on code smells: https://inventwithpython.com/beyond/chapter5.html If you find a code smell, some refactoring is probably in order. We have duplication in our load_* functions - a classic code smell.)

12.3 __init__

When we use other packages we do stuff like:

import numpy as np
np.tanh

But currently this syntax doesn’t work for our library. To make this work we need to expose our functions in the package initialisation script __init__.py. You do this by importing the functions and classes you’d like the user to be able to use in the file. We want the users to be able to make our three classes, and to load any of them, so we’ll expose these by modifying __init__.py as follows:

from .grid import Grid, load_grid
from .lagrangian import Lagrangian, load_lagrangian
from .soliton import Soliton, load_soliton

Now a user can do something like

import solitons1d as sol
my_grid = sol.Grid(100, 0.1)

12.4 Make a library of Lagrangians

There are a few common Lagrangians, and we could add them to our package so that users don’t have to define the potential and derivative of the potential themselves.

To do this we’ll build a dictionary for each theory, and store these in a dictionary of dictionaries. The dictionary will contain everything you need to build the Lagrangian. For example, the phi4 dictionary will look like:

{"V": phi4_V, "dV": phi4_dV, "vacua": [-1,1]}

Where we’ve seen the functions phi4_V and phi4_dV earlier in the course. I’ll make a new file called lagrangian_library.py and add the library to it. And a function which can make a Lagrangian from a string:

from numba import njit
import numpy as np

from .lagrangian import Lagrangian

@njit
def phi4_V(x):
    return 0.5*np.pow(1-np.pow(x,2),2)

@njit
def phi4_dV(x):
    return 2*np.pow(x,3) - 2*x

lagrangians = {
    "phi_4": {"V": phi4_V, "dV": phi4_dV, "vacua": [-1,1]},
}

list_of_lagrangians = list(lagrangians.keys())

def make_lagrangian_from_library(lagrangian_string):
    
    lagrangian_dict = lagrangians.get(lagrangian_string)
    assert lagrangian_dict is not None, f"{lagrangian_string} is not in the Lagrangian"\
    "Library. You can view the supported Lagrangians in `solitons1d.list_of_lagrangians`"

    lagrangian = Lagrangian(**lagrangian_dict)
    return lagrangian

Now we want to the user to be able to use this when they make a Soliton. There are a few ways to do this - I’m going to let the user enter a string to the lagrangian argument of Soliton, so that the user can run something like:

my_phi4_soliton = Soliton(lagrangian="phi_4", grid=my_grid)

We can alter the bit of the soliton.py code where the Lagrangin is added to something like.

        if isinstance(lagrangian, str):
            self.lagrangian = make_lagrangian_from_library(lagrangian)
        elif isinstance(lagrangian, Lagrangian):
            self.lagrangian = lagrangian
        else:
            raise TypeError("`lagrangian` must be a string or a Lagrangian object")

You’ll need to import the make_lagrangian_from_library function. And if you’d like to expose the list_of_lagrangians list, you need to add it the __init__.py file.

Try adding sine-Gordon theory and phi6 Lagrangians to the library.

We can help the user out by exposing the list of possible Lagrangian’s in the type information of the lagrangian argument. We do this using the Literal type. We need to import Literal from the typing library by adding from typing import Literal to the top of the soltion.py file. Then we update the typing information as follows:

        lagrangian: Lagrangian | Literal["phi_4", "sine_Gordon"],

Now when you start typing in Jupyter, the list of possible Lagrangian’s should pop up when entering the lagrangian argument.

12.5 Ideas to build on the codebase

These are all exercises:

  • Add time evolution. This will look a lot like gradient flow, but a bit more complicated.
  • Add Arrested Newton Flow.
  • Allow for gradient flow to keep going until a certain tolerance is reached (that is: keep flowing until the functional derivative of the energy is smaller than some tolerance).
  • Allow for any boundary conditions.
  • Allow for various orders of derivative.
  • Allow for multiple fields
  • Write code that finds the normal modes of the soliton
  • Write code that finds the one-loop correction of the soliton

12.6 Advice on making code and figures for papers

In this course, we’ve made a package to flow 1D solitons. And you might think that you’d only make a package for this kind of thing: something you do over and over again and that anyone can benefit from. I now believe the opposite: I make a package for every project I do. Each paper has at least one package. If the paper has distinct sections, I make a package for each. I use packages from an older paper in my newer paper packages. Packages packages packages.

This approach comes in especially handy when you’re making presentations, or figures. When I now make a plot, I first write a script (either a Jupyter notebook or a .py Python script) to generate the data. Then I save the data locally. When writing the script, if I write any code that seems like it would be helpful in the future, I put this into a package.

Once I have the data I make a different script to make the Figure. I find this separation between data generation and Figure making extremely helpful.

So, if I was going to make a Figure demonstrating that the accuracy of the energy improves as you decrease the grid spacing, I would first write a script that computes the energy for various grid spacings (as a .py script)

import numpy as np
import pandas as pd
import solitons1d as sol

grid_spacings = [0.4, 0.2, 0.1, 0.05]
energies = []

for grid_spacing in grid_spacings:

    grid_length = 10
    grid_points = round(10/grid_spacing)
    grid = sol.Grid(grid_points, grid_spacing)

    soliton = sol.Soliton(grid=grid, lagrangian='phi_4', initial_profile_function=np.tanh)

    # here I choose a million steps, but really I should use a tolerance (once implemented)
    soliton.grad_flow(dt=0.00001, num_steps=1_000_000)
    soliton.compute_energy()

    # add the computed energy to the list of energies
    energies.append(soliton.energy)


# Turn the data into a pandas DataFrame, then export to csv
data = pd.DataFrame(np.transpose(np.array([grid_spacings, energies])), columns=["grid_spacing", "energy"])
data.to_csv("spacings_vs_energies.csv")

…and once I have the data, I would then write a plotting script (probably as a Jupyer notebook):

import matplotlib.pyplot as plt
import pandas as pd

data = pd.read_csv("spacings_vs_energies.csv")

fig, ax = plt.subplots()
ax.plot(data['grid_spacing'], data['energy'])
ax.axline((0.05,4/3), (0.4,4/3),  color="black", linestyle=(0, (5, 5)))
ax.set_xlabel("Grid spacing")
ax.set_ylabel("Energy")
fig.savefig("energy_vs_grid_spacing.pdf")

12.7 Tests

Why are tests important? You would think it’s to check that our code does what we say it does. But actually the most important reason to make tests is so that you can confidently change and updated your code. Try to get into the habit of re-running your tests every time you make a non-trivial change to your code.

A “unit” test should check one, small, piece of your code and makes sure it does what you’re expecting. Ideally you are testing just one function or method.
Tests can also check that your code fails in the correct way. Ideally, every single line of code in run during your tests. Codebases get a “code coverage” score, which is the percentage of lines that are tested during a test run. Aim for 100%.

The most common testing tool in python is called pytest. For this, we can make a tests directory and create a test script in it. People structure this in different ways: we’ll make a directory in src/soltion1d called tests then make a python script called all_tests.py in this.

In all_tests.py, I’ll write a function which tests the first derivative function. (General rule: you don’t need to be as carefuly with typing and docstrings for tests. But it’s a good idea to write what you’re trying to test in the docstring, and how you’ll do it).

import numpy as np
from solitons1d.soliton import get_first_derivative

def test_first_derivative():
    """
    Computes the first derivative of the function f(x) = 2x and checks
    that it is 2.
    """

    lattice_spacing = 0.1
    linear_function = 2*np.arange(-1,1,lattice_spacing)
    first_derivative = get_first_derivative(linear_function, len(linear_function), lattice_spacing)

    # Don't include the boundary points, which will be zero
    assert np.allclose(first_derivative[2:-2], 2.0)

To run the test suit we need to add pytest to our package by running uv add tool pytest in Terminal, while cdd in the project. Then we can run uv run pytest src/solitons1d/tests/. I get the following output:

======================== test session starts =========================
platform darwin -- Python 3.13.2, pytest-8.4.0, pluggy-1.6.0
rootdir: /Users/christopherhalcrow/Work/fromgit/solitons1d
configfile: pyproject.toml
collected 1 items                                                                                                      

src/solitons1d/tests/all_test.py .     [100%]

========================= 1 passed in 0.41s ==========================

Now let’s make a test to check that constructing a Grid works as expected:

def test_grid_construction():
    """
    Makes a grid object and checks that data is propagated to the object.
    Then checks that an error is raised the number of lattice points is given as
    a float.
    """

    num_grid_points = 100
    grid_spacing = 0.2

    grid = Grid(num_grid_points=num_grid_points, grid_spacing=grid_spacing)

    # Check that arguments become attributes of the object
    assert grid.num_grid_points == num_grid_points
    assert grid.grid_spacing == grid_spacing

Exercise: write tests for everything! Ideally, you’d write tests as you go along.

12.8 Documentation

We’ll make a set of Documentation using a package called (sphinx)[https://www.sphinx-doc.org/en/master/index.html]. Later, we’ll get GitHub to automatically generate the docs every time that we make a commit. We’re really just going to follow (this tutorial)[https://www.sphinx-doc.org/en/master/tutorial/index.html]. Feel free to read that instead of this.

First of all, we need to add sphinx, which is a python package, to our package by running uv add sphinx. There are lots of ways to organise your docs, but we’ll let sphinx guide us through the process by running sphinx-quickstart docs in the Terminal (make sure you’re in the solitons1d folder). It will ask you some questions. My answers were: y, solitons1d, Chris Halcrow, 0.1, en.

Sphinx has now generated a docs folder in your project folder. We can “build” these docs by running sphinx-build -M html docs/source/ docs/build/ in Terminal. Doing this has built a website, located at docs/build/html/index.html. Go open it using your web browser!

To edit your Documentation, you need to edit your source files (in docs/source). The index.rst file generates the index.html file. RST stands for restructured text: it’s a simple text format. There’s a nice cheatsheet for it (https://sphinx-tutorial.readthedocs.io/cheatsheet/)[here]. Here’s an example of what you could put in index.rst.

Solitons1d documentation
========================

Here is the documentation for Solitons1d, made while doing the 
`Good Scientific Software Practices, through Solitons <https://chrishalcrow.github.io/SIG_soliton_numerics/>`_.

You can make and plot a soliton like so:

.. code-block:: python

   import solitons1d as sol
   import numpy as num_grid_points

   my_grid = sol.Grid(num_grid_points=100, grid_spacing=0.1)
   my_soliton = sol.Soliton(
      grid=grid,
      lagrangian="phi_4",
      initial_profile_function=np.tanh
   )

How to install
==============

Here are some installation instructions.

You can add more pages, or generate an API from your docstrings. Have fun!

12.9 Continuous Integration

The aim of this section is to get GitHub to run our tests, and build and host our documentation.