Disclaimer and a note about LLMs in software engineering: I used ChatGPT for all of the debugging talked about in this post. But since this was a particularly nasty issue, ChatGPT started hallucinating once the complexity reached a certain threshold. I do see the point of LLMs in software engineering. They can quickly throw ideas at you to consider, but those ideas are not necessarily high quality and can even send you down the wrong rabbit hole and waste your time if you're not thinking critically. So it's OK, even necessary, to use LLMs, but make sure to think critically about what the LLM is outputting. It's a fine line.
So, this weekend I decided to learn pydantic. I set up a git repo, wrote some demo pydantic code and, because I had previously learnt about pre-commit, I also set up pre-commit for this repo (including the mypy hook). Then when I tried to run pre-commit run --all-files, I got an error: pydantic\pydantic_models.py:6: error: Module "pydantic" has no attribute "BaseModel" [attr-defined]. This came from pre-commit running the mypy hook on one of the files where I had imported BaseModel from pydantic.
Hmm. Looking into this led me to pydantic's official site, which talks about how to enable a plugin so that mypy plays well with pydantic. I said "sure, this doesn't seem too complicated. Let me just do it". The solution is to add a mypy.ini file and add the below to the file.
[mypy]
plugins = pydantic.mypy
Having done this, I ran pre-commit run --all-files and it didn't work. That's software engineering for ya. What was the error now? pre-commit mypy mypy.ini:2: error: Error importing plugin "pydantic.mypy": No module named 'pydantic' [misc]. And this took me some time to figure out. Trawling through GitHub issues, I came across this relevant issue. It seems mypy runs inside its own environment created by pre-commit, which doesn't have pydantic installed, so I just need to add pydantic as an additional dependency for the mypy hook in .pre-commit-config.yaml.
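A minimal sketch of what that change can look like, assuming the mypy hook comes from the pre-commit/mirrors-mypy repo. The post doesn't say which repo or rev was actually used, so both are placeholders here rather than the exact config from this project.

```yaml
# .pre-commit-config.yaml (sketch; repo and rev are assumptions, not from this post)
repos:
  - repo: https://github.com/pre-commit/mirrors-mypy
    rev: v1.8.0  # placeholder: keep whichever tag you already pin
    hooks:
      - id: mypy
        # Installed into the isolated env pre-commit builds for this hook,
        # so that the pydantic.mypy plugin can be imported from there.
        additional_dependencies: [pydantic]
```

Changing additional_dependencies makes pre-commit build a fresh environment for the hook the next time it runs.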
I did that and ... it didn't work. Well, you get used to it after a while.
What was the issue now?
  File "C:\Users\[username]\.cache\pre-commit\repob60u4mod\py_env-default\lib\site-packages\pip\_vendor\appdirs.py", line 486, in _get_win_folder_from_registry
    dir, type = _winreg.QueryValueEx(key, shell_folder_name)
FileNotFoundError: [WinError 2] The system cannot find the file specified
Note that installing mypy into the project's Python env and running mypy directly on the file does not lead to any issue. This only occurs when mypy is run via pre-commit.
What even is going wrong in that message? It seems the temporary Python env which pre-commit creates (to run mypy) uses a portable version of Python which, when it tries to use PlatformDirs, looks up a folder location via CSIDL_COMMON_APPDATA. Since this is a portable Python, it doesn't have access to the Windows registry, and so the lookup fails. I'm not entirely sure this is true, but from reading the message and from this I'm fairly certain that I'm on the right path.
Here's an issue from the pre-commit GitHub repo where someone mentions that it works in one environment and not in another. Some mickey mouse shit going on here, huh.
And honestly, at this point I had no idea what I could do to fix this, because the issue comes from pre-commit itself. Now, originally I had created my repo with Python 3.9 and run the pre-commit hooks with all the pydantic stuff, and it worked. But since I had just added pydantic as an additional dependency for mypy in .pre-commit-config.yaml, I figured that was what was causing the issue. So I removed all that, ran pre-commit run --all-files, and it gave me the same FileNotFoundError. My jaw dropped. I undid the breaking changes, i.e. restored my project to a state where everything worked, and now magically it's not working? Yeah, another thing you get used to in software engineering. I had just forgotten about it since it's been a while since this last happened to me. Reminds me of the "It works on my system" meme.
So... what now? I thought "Well, let me just create a new git project with a barebones .pre-commit-config.yaml and see if that works". I did, and it didn't work. Surprise? Not really. So then it dawned on me that this issue isn't even related to mypy+pydantic+pre-commit. It's just related to pre-commit, because in this barebones project I didn't have pydantic or mypy. And this was the part that was particularly hard and time consuming (a weekend's worth of time consuming). But after looking at older environments where pre-commit worked, I hypothesised that maybe the Python version was causing the issue. I was using Python 3.9 for this repo and for the other repos where I was getting the FileNotFoundError, whereas when I updated the Python version to 3.11, it worked. So that looks like the issue, then. I haven't found any relevant information online about this, so who really knows.
But what confused me was that when I started on this journey, I had created the Python environment with Python 3.9 and run pre-commit with mypy (but without pydantic), and it worked. So why does it not work now? I think it's because pre-commit reuses cached Python environments to run the hooks if the .pre-commit-config.yaml is the same. Before this I had run pre-commit with Python >3.9, so the first time I ran pre-commit on my new repo, instead of using Python 3.9 it used the cached Python >3.9 environment, and it worked. But then I changed the .pre-commit-config.yaml by adding pydantic as an additional dependency. Now it created a new environment with Python 3.9, and that broke (the project and my mind).
It's quite weird that pre-commit behaves this way. And I'm not entirely sure if my analysis is correct. It's the only explanation that makes any sense to me. But if this is so, then it's a bug in pre-commit.
Well, in any case, how do we fix this? One option is to just use Python >3.9 in my environment. But what if I had a requirement to use Python 3.9 in my project? Am I screwed? Not really. We can specify the Python version that pre-commit should use in .pre-commit-config.yaml:
default_language_version:
  python: python3.11
We can also specify this per hook, i.e. each hook can run with a different Python version. And I guess if you don't specify this, it uses the Python version of your main project's environment. But as I've learnt, that can lead to the issue which is the whole point of this post.
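A minimal sketch of the per-hook variant, again assuming the pre-commit/mirrors-mypy repo with a placeholder rev (neither is stated in this post). The language_version key on a hook applies to that hook only and takes precedence over default_language_version.

```yaml
# .pre-commit-config.yaml (sketch; repo and rev are assumptions, not from this post)
repos:
  - repo: https://github.com/pre-commit/mirrors-mypy
    rev: v1.8.0  # placeholder: keep whichever tag you already pin
    hooks:
      - id: mypy
        # Only this hook's environment is built with Python 3.11;
        # other hooks keep using the default.
        language_version: python3.11
        additional_dependencies: [pydantic]
```

This keeps the project itself on Python 3.9 while the hook environment is built with 3.11, provided a python3.11 interpreter is installed and discoverable on the machine.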
But in the end, frustrating as it was, I learnt a lot, and this is also where the fun in software engineering comes from. Solving problems. I live for it.
And that's it. Thanks for coming to my TED talk. So long and thanks for all the fish.