Converting to Django - Webucator’s Story
Around the time Covid struck in the first months of 2020, I began working on a proof of concept for converting our development stack from ColdFusion, SQL Server, and many different JavaScript frameworks running on AWS – a stack we had been using for sixteen years – to Python, Django, PostgreSQL, and Vue.js running on Heroku. In July, 2020, we (a team of four developers) began actively working on the project. And on the evening of Thursday, July 1, 2021, we launched our new site. Although we were migrating from ColdFusion and SQL Server, our story is relevant to anyone switching from any stack (e.g., PHP and MySQL or ASP.NET and SQL Server) to Django.
Why Switch Stacks?
Our existing stack and code base were slowing us down and, more importantly, keeping us from creating new applications and redesigning existing applications. It had started with a strong base, but over the years, it had simply become too complex. We only had one developer on staff who really understood it, and the prospect of getting anyone else up to speed on the custom ColdFusion framework we had built years ago and on all of those outdated JavaScript libraries (YUI, Angular 1.5, and jQuery) that were used in different applications was daunting.
I had a couple of big redesign projects in mind, and I felt it would be easier for our experienced ColdFusion developer, who had deep knowledge of our code base, to learn Python, Django, and Vue.js than it would be for me and anyone else we brought on to learn how to work effectively in our existing code base. I was a little worried that he wouldn’t be as excited about the switch as I was, but he had had his own frustrations with the code base and was excited about doing something new.
Our Pre-Plan Plan
The first thing we did was figure out what we needed to do. Here was our pre-plan plan:
- Freeze our current code.
- Make a plan to learn Python, Django, and Vue.js.
- Take an inventory of all of our sites and tools.
- Interview staff on how tools are used.
- Make a plan for tackling development.
From this, we were able to come up with a development plan and divide up the work.
The Scariest Piece: The Data
The piece I was most concerned about was importing the data from our SQL Server database to our new PostgreSQL database. Our database was not trivial. It included 216 tables containing millions of rows. Some of those tables were no longer used; others were awkwardly designed or no longer optimal for how our business had evolved. We had to figure out how to move all this data into our new Django models.
Although it is possible to integrate Django with a legacy database, it’s hard, and I suspect you would lose a lot of the benefits of working with Django if you went that route: your models would be designed based on your database tables rather than the other way around.
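For context, Django can generate model classes from an existing schema with its built-in inspectdb management command, and the result is a set of unmanaged models that simply mirror the legacy tables. The sketch below shows roughly what that output looks like for our old People table (the primary-key column name here is a guess for illustration), which makes the point concrete: the models follow the old table design rather than a design you would choose today.

# Roughly what "python manage.py inspectdb" produces for a legacy table.
# managed = False tells Django never to create, alter, or drop this table.
from django.db import models


class People(models.Model):
    personid = models.AutoField(db_column="PersonID", primary_key=True)  # column name assumed
    companyid = models.ForeignKey(
        "Companies", models.DO_NOTHING, db_column="CompanyID", blank=True, null=True
    )

    class Meta:
        managed = False
        db_table = "People"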
So, we decided to recreate all of our apps and then import our existing data into the tables generated by our Django models.
How to Maintain Relationships?
Of course, those 216 tables were related to each other in all sorts of different ways. We had to figure out how to maintain those relationships.
For example, in our legacy database, we had People and Companies tables, which were connected via the People.CompanyID column. We would need to retain this relationship between our CustomUser and Company models in Django. To do that, we included old_id fields in all the models. For example:
from django.contrib.auth.models import AbstractUser
from django.db import models


class Company(models.Model):
    old_id = models.CharField(max_length=50, null=True, unique=True)


class CustomUser(AbstractUser):
    old_id = models.CharField(max_length=50, null=True, unique=True)
    old_company_id = models.CharField(max_length=50, null=True)
    company = models.ForeignKey(
        Company, on_delete=models.PROTECT, null=True, blank=True, related_name="users"
    )
Our importing code looked something like this:
def create_users():
    # Get all the users from SQL Server
    query_file = settings.BASE_DIR / "importer/management/queries/Users.sql"
    q = query_file.read_text()
    cursor = execute_query(q)
    rows = get_rows_as_dicts(cursor)
    # Loop through the users, adding companies as new ones are found.
    for row in rows:
        # Add or get company
        company = None
        if row["old_company_id"]:
            try:
                # Use existing company if it exists
                company = Company.objects.get(old_id=row["old_company_id"])
            except Company.DoesNotExist:
                # Create new company
                company = Company(
                    name=row["company_name"],
                    old_id=row["old_company_id"],
                    # …
                )
                company.save()
        # Add user
        user = CustomUser(
            date_joined=tz.localize(row["DateEntered"]),
            company=company,
            old_id=row["old_id"],
            # …
        )
        user.save()
    cursor.close()
The actual code was significantly more complicated, with data cleanup (trimming values, identifying and removing junk data, and so on), error checking, and reporting, but the above gives the general idea.
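To give a flavor of that cleanup step, here is a minimal sketch of the kind of helper involved; the function name and the specific rules (trimming whitespace, treating placeholder strings as missing) are illustrative assumptions rather than our exact code:

def clean_value(value):
    """Illustrative cleanup helper for values coming out of the legacy database."""
    if value is None:
        return None
    if isinstance(value, str):
        value = value.strip()
        # Assumption for illustration: legacy data sometimes used empty or
        # placeholder strings where the new schema expects None.
        if value in ("", "N/A", "NULL"):
            return None
    return value

In practice, each incoming row value would pass through checks along these lines before being handed to the model constructors.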
In some cases we needed to take a second pass through a table. For example, our Company model includes a salesperson field, which is a foreign key referencing the CustomUser table. We couldn’t add that value until we were sure the user existed, so we needed to take a second pass through the company data after we had imported all the users to add salespeople to the company records.
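That second pass looked roughly like the sketch below. It reuses the same helpers as create_users; the query file name and the legacy column names are assumptions for illustration:

def add_salespeople():
    # Second pass over the legacy company data, run only after all users exist.
    query_file = settings.BASE_DIR / "importer/management/queries/Companies.sql"
    cursor = execute_query(query_file.read_text())
    for row in get_rows_as_dicts(cursor):
        if not row["salesperson_id"]:
            continue
        try:
            company = Company.objects.get(old_id=row["old_id"])
            company.salesperson = CustomUser.objects.get(old_id=row["salesperson_id"])
            company.save()
        except (Company.DoesNotExist, CustomUser.DoesNotExist):
            # In the real importer, mismatches like this were reported and
            # investigated rather than silently skipped.
            continue
    cursor.close()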
Verifying the Data Import
It was extremely difficult to verify that we were importing all of our data correctly. We had two primary methods for doing this:
- Reporting. By recreating all of our old reports, we were able to verify that the numbers matched. For example, we could check our monthly revenue by line of business to make sure the new reports agreed with the old ones (a sketch of that kind of check follows this list).
- About three months before making the switch, we made the new site available to staff and asked them to try to do everything they normally do in both places. This was, of course, painful and time consuming, but it helped ensure that the data on the new site matched the data on the old site. It also had the added benefits of:
- Training staff on Django admin before they’d have to use it in production.
- Identifying features staff needed that we had failed to bring over.
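As a concrete illustration of the reporting check mentioned above, a comparison query on the Django side might look like the sketch below; the Order model and its field names are hypothetical stand-ins, since our actual reporting models are more involved:

from django.db.models import Sum

from orders.models import Order  # hypothetical app and model, for illustration only


def revenue_by_line_of_business(year, month):
    # Total revenue for one month, broken down by line of business, to be
    # compared against the same figures produced by the legacy reports.
    return (
        Order.objects.filter(date__year=year, date__month=month)
        .values("line_of_business")
        .annotate(total=Sum("amount"))
        .order_by("line_of_business")
    )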
Unexpected Challenges: Docker and Heroku
We knew most of the challenges we’d face:
- Learning the new technologies turned out to be easier than we expected, and it was a ton of fun.
- Porting the data was a lot of work, even more than I expected, but I felt early on that it was doable.
- Working with staff on recreating the apps was both fun and helpful in getting a better understanding of how our own business works.
But we didn’t know about the challenges we’d face with Docker and Heroku. After more than a year of working with those technologies, I feel the same way about them that I feel about AWS:
You can always get things to work, but you often don’t quite know how you did it.
Docker and Heroku both caused us some headaches, and I still don’t feel like I’m an expert at either of them, but we have them working well.
How Long Did It Take and How Did It Go?
With two full-time and two part-time developers (two computer science majors who had taken a Covid-inspired gap year from Brown University), it took 12 months to complete the conversion. This included the learning process: most of the team was relatively new to Python and Django, and also to Vue.js, which we used heavily for our self-paced course application.
We had a few minor hiccups the first week or so after launching, but we were able to resolve them quickly.
Our staff is rapidly getting comfortable with Django admin. We have kept our old site up (with the old data) in case people need to verify something or get access to some feature we forgot, but that’s rarely necessary. We’re planning to shut it down soon.
The part-time developers are back at college, so it’s just two of us in development now. We both love the new stack.
So, overall, it has been a great success. We’re happy.