STAT 39000: Project 5 — Fall 2021
Testing Python: part II
Motivation: Code, especially newly written code, is refactored, updated, and improved frequently. It is for these reasons that testing code is imperative. Testing code is a good way to ensure that code is working as intended. When a change is made to code, you can run a suite a tests, and feel confident (or at least more confident) that the changes you made are not introducing new bugs. While methods of programming like TDD (test-driven development) are popular in some circles, and unpopular in others, what is agreed upon is that writing good tests is a useful skill and a good habit to have.
Context: This is the second in a series of two projects that explore writing unit tests, and doc tests. In The Data Mine, we will focus on using pytest, and mypy, while writing code to manipulate and work with data.
Scope: Python, testing, pytest, mypy
Dataset(s)
The following questions will use the following dataset(s):
- 
/depot/datamine/data/apple/health/2021/*
Questions
| At this end of this project you will have only 3 files to submit: 
 Make sure that the output from running the cells is displayed and saved in your notebook before submitting. | 
Question 1
First, setup your workspace in Brown. Create a new folder called project05 in your $HOME directory. Make sure your apple_watch_parser package from the previous project is in your $HOME/project05 directory. In addition, create your new notebook, firstname-lastname-project05.ipynb in your $HOME/project05 directory. Great!
| An updated  Note that this updated package is not required by any means to complete this project. | 
Now, in $HOME/project05/apple_watch_parser/test_watch_data.py, modify the test_time_difference function to use parametrizing to test the time difference between 100 sets of times.
| The following is an example of how to generate a list of 100 datetimes 1 day and 1 hour apart.  | 
| An example of how to convert a datetime to a string in the same format our   | 
| See the very first example here for how to parametrize a test for a function accepting 2 arguments instead of 1. | 
| The   | 
| You do not need to manually calculate the expected result for each combination of datetime’s that you will pass to the  | 
Run the pytest tests from a bash cell in your notebook.
%%bash
cd $HOME/project05
python -m pytestRelevant topics: parametrizing tests
- 
Code used to solve this problem. 
- 
Output from running the code. 
Question 2
Read and understand the filter_elements method in the WatchData class. This is an example of a function that accepts another function as an argument. A term for functions that accept other functions as arguments are higher-order functions. Think about the filter_elements method, and list at least 2 reasons why this may be a good idea.
The docstring contains an explanation of the filter_elements method. In addition, the module provides a function called example_filter that can be used with the filter_elements method.
Use the example_filter function to filter the WatchData class, and print the first 5 results.
| Remember to import and use the package, make sure that the notebook is in the same directory as the   | 
| When passing a function as an argument to another function, you should not include the opening and closing parentheses in the argument. For example, the following is not correct. Why? Because the  | 
- 
Code used to solve this problem. 
- 
Output from running the code. 
Question 3
Write your own *_filter function in a Python code cell in your notebook (like example_filter) that can be used with the filter_elements method. Be sure to include a Google style docstring (no doctests are needed).
Does it work as intended? Print the first 5 results when using your filter.
- 
Code used to solve this problem. 
- 
Output from running the code. 
Question 4
In the previous project, we did not test out the filter_elements method in our WatchData class. Testing this method is complicated for two main reasons.
- 
The method accepts any function following a set of rules (described in our docstring) as an argument. This (a *_filterfunction) may not be something that is available after immediately importing theWatchDataclass — normally there wouldn’t be anexample_filterfunction in the module for you to use, as this would be a function that a user of the library would create for their own purposes.
- 
In order to be able to test the filter_elementsmethod, we would need a dataset that is similarly structured as the intended dataset (Apple Watch exports), that we know the expected output for, so we can test.
pytest supports writing fixtures that can be used to solve these problems.
To address problem (1):
- 
Remove the example_filterfunction from thewatch_data.pymodule, and instead, modify thetest_watch_data.pyfile and add theexample_filterfunction to thetest_watch_data.pymodule as apytestfixture. Read this and 2 or 3 following sections. In addition, see this stackoverflow answer to better understand how to create a fixture that is a function that can accept arguments.
| You may need to import lxml and other libraries in your   | 
| Why do we need to do something like the stackoverflow post describes? The reason is, by default,  | 
| In the example in the stackoverflow post, the  As a side note, sometimes helper functions (functions defined and used inside of another function) are called helper functions, and it is good practice to name them starting with an underscore — just like the  | 
| You can start by cutting the  | 
To address problem (2):
- 
Create a new test_datadirectory in yourapple_watch_parserpackage. So,$HOME/project05/apple_watch_parser/test_datashould now exist. Add/depot/datamine/data/apple/health/2021/sample.xmlto this directory, and rename it toexport.xml. So,$HOME/project05/apple_watch_parser/test_data/export.xmlshould now exist.sample.xmlis a small sample of the the watch data that we can use for out tests. It is small enough to be portable, yet is similar enough to the intended types of datasets that it will be a good way to test ourWatchDataclass and its methods. Since we renamed it toexport.xml, it will work with ourWatchDataclass.
- 
Create a test_filter_elementsfunction in yourtest_watch_data.pymodule. Use this library (already installed), to handle properly copying thetest_data/export.xmlfile to a temporary directory for the test. Examples 2 and 3 here will be particularly helpful.You may be wondering why we would want to use this library for our test rather than just hard-coding the path to our test files in our test function(s). The reason is the following. What if one of your functions had a side-effect that modified your test data? Then, any other tests you run using the same data would be tainted and potentially fail! Bad news. This package allows for a systematic way to first copy our test data to a temporary location, and then run our test using the data in that temporary location. In addition, if you have many test function that work on the same dataset, you can do something like the following to re-use the code over and over again. export_xml_decorator = pytest.mark.datafiles(...) @export_xml_decorator def test_1(datafiles): pass @export_xml_decorator def test_2(datafiles): passEach of the tests, test_1andtest_2, will work on the same example dataset, but will have a fresh copy of the dataset each time. Very cool!The decorator, @pytest.mar.datafiles()is expecting a path to the test data,export.xml. To get the absolute path to the test data,$HOME/project05/apple_watch_parser/test_data/export.xml, you can use thepathliblibrary.test_watch_data.pyimport watch_data # since watch_data.py is in the same directory as test_watch_data.py, we can import it directly from pathlib import Path # To get the path of the watch_data Python module this_module_path = Path(watch_data.__file__).resolve().parent print(this_module_path) # $HOME/project05/apple_watch_parser # To get the test_data folders absolute path, we could then do print(this_module_path / 'test_data') # $HOME/project05/apple_watch_parser/test_data # To get the test_data/export.xml absolute path, we could then do ...? # HINT: The answer to this question is _exactly_ what should be passed to the `@pytest.mark.datafiles()` decorator. @pytest.mark.datafiles(answer_here) def test_filter_elements(datafiles, example_filter_fixture): # replace example_filter_fixture with the name of your fixture function pass
Okay, great! Your test_watch_data.py module should now have 2 additional functions, "symbolically" something like this:
# from https://stackoverflow.com/questions/44677426/can-i-pass-arguments-to-pytest-fixtures
@pytest.fixture
def my_fixture():
  def _method(a, b):
    return a*b
  return _method
@pytest.mark.datafiles(answer_here)
def test_filter_elements(datafiles, my_fixture):
    passFill in the test_filter_elements function with at least 1 assert statements that tests the filter_elements function. It could be as simple as comparing the length of the output when using the example_filter function as our filter. test_data/example.xml should return 2 elements using our example_filter function as the filter.
| As a reminder, to run   | 
| If you get an error that says pytest.mark.datafiles isn’t defined, or something similar, do not worry, this can be ignored. Alternatively, if you add a file called  pytest.ini [pytest]
markers =
    datafiles: mark a test as a datafiles. | 
- 
Code used to solve this problem. 
- 
Output from running the code. 
Question 5
Create an additional method in the WatchData class in the watch_data.py module that does something interesting or useful with the data. Be sure to include a Google style docstring (no doctests are needed). In addition, write 1 or more pytest tests for your new method that uses fixtures. Make sure your test passes (you can run your pytest tests from a bash cell in your notebook).
If you are up for a bigger challenge, design your new method to be similar to filter_elements in that a user can write their own functions or classes that can be passed to it (as arguments) in order to accomplish something useful that they may want to be customized.
| We will count the use of the  | 
Make sure to run the pytest tests from a bash cell in your notebook.
%%bash
cd $HOME/project05
python -m pytest- 
Code used to solve this problem. 
- 
Output from running the code. 
| Please make sure to double check that your submission is complete, and contains all of your code and output before submitting. If you are on a spotty internet connection, it is recommended to download your submission after submitting it to make sure what you think you submitted, was what you actually submitted. |