Misto's Blog

Posted on Apr 9th 2025 at 07:58:32 PM by (Misto)
Posted under devlog, development, site, rebuild

Hello again!  I'm back to discuss the progress on the site rebuild.  For this entry I'm just going to talk a little more about the issues I uncovered while performing data migration into the shiny new database.

Before I get into too much detail, I would like to say that the migration efforts are actually coming along quite nicely.  I think I'm nearing the final steps of moving everything over, with the games (aka games & variants) table being the last one I'm in the process of tackling.

Now let's get into the issues during the migration itself and how some of them were resolved.  First off, this was a fairly big undertaking.  I already talked in detail about the issues with free form fields in the last dev log.  All of the company, person, genre, etc. data has been standardized in the new tables.  One other free form field proved to be much more complicated - control types (controller on the game page).  The values were so varied that it was hard to get good results even with fuzzy matching.  The original plan was to try to map the listed control to an actual accessory; for instance, "standard controller" would map to the DualShock 2 controller for a PS2 game.  This led to me adding a field in the database to set a default controller for the console family.  This works for the most part, but any specialty controllers are much harder to map without manual intervention.  Furthermore, add-ons like the 32X and Sega CD didn't have their own accessories - they just used the ones for the Genesis.  That led to another table to support compatibility mapping between consoles.  Of course, that's not the end of controller/accessory compatibility: there are instances where some accessories only work with specific consoles within a family.  One example is the TurboDuo game pad, which is not compatible with the original TurboGrafx-16.  This needs another table to map which accessories work on individual platforms within a console family.
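To make the three pieces above (default controller per family, compatibility between consoles, and per-console accessory limits) a bit more concrete, here is a rough sketch using Python's built-in sqlite3 module.  It is not the actual RFG schema: every table name, column name, and the "Genesis controller" accessory name are made up for illustration, and treating the 32X as its own console family is an assumption.

```python
import sqlite3

# Hypothetical schema; names are illustrative, not the real RFG tables.
SCHEMA = """
CREATE TABLE console_family (
    id INTEGER PRIMARY KEY,
    name TEXT UNIQUE NOT NULL,
    default_controller_id INTEGER          -- accessory.id, NULL if none is set
);
CREATE TABLE console (
    id INTEGER PRIMARY KEY,
    name TEXT UNIQUE NOT NULL,
    family_id INTEGER NOT NULL REFERENCES console_family(id)
);
CREATE TABLE accessory (
    id INTEGER PRIMARY KEY,
    name TEXT UNIQUE NOT NULL,
    family_id INTEGER NOT NULL REFERENCES console_family(id)
);
-- Add-ons such as the 32X borrow the accessories of another console.
CREATE TABLE console_compatibility (
    console_id INTEGER NOT NULL REFERENCES console(id),
    uses_console_id INTEGER NOT NULL REFERENCES console(id),
    PRIMARY KEY (console_id, uses_console_id)
);
-- Accessories limited to specific consoles within their family.
CREATE TABLE accessory_console (
    accessory_id INTEGER NOT NULL REFERENCES accessory(id),
    console_id INTEGER NOT NULL REFERENCES console(id),
    PRIMARY KEY (accessory_id, console_id)
);
"""

# Console plus any console it borrows accessories from (hypothetical helper query).
HOSTS_SQL = """
SELECT c.id, c.family_id FROM console c WHERE c.name = :name
UNION
SELECT h.id, h.family_id
FROM console c
JOIN console_compatibility cc ON cc.console_id = c.id
JOIN console h ON h.id = cc.uses_console_id
WHERE c.name = :name
"""


def build_demo_db():
    """Create an in-memory database with a few sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executescript("""
        INSERT INTO console_family VALUES
            (1, 'PlayStation 2', 1), (2, 'Genesis', 2),
            (3, '32X', NULL), (4, 'TurboGrafx-16', NULL);
        INSERT INTO console VALUES
            (1, 'PlayStation 2', 1), (2, 'Genesis', 2), (3, '32X', 3),
            (4, 'TurboGrafx-16', 4), (5, 'TurboDuo', 4);
        INSERT INTO accessory VALUES
            (1, 'DualShock 2', 1), (2, 'Genesis controller', 2),
            (3, 'TurboDuo game pad', 4);
        INSERT INTO console_compatibility VALUES (3, 2);  -- 32X uses Genesis
        INSERT INTO accessory_console VALUES (3, 5);      -- pad is TurboDuo only
    """)
    return conn


def default_controller(conn, console_name):
    """Family default for a console, falling back to a console it borrows from."""
    for _console_id, family_id in conn.execute(HOSTS_SQL, {"name": console_name}):
        row = conn.execute(
            "SELECT a.name FROM console_family f "
            "JOIN accessory a ON a.id = f.default_controller_id WHERE f.id = ?",
            (family_id,),
        ).fetchone()
        if row:
            return row[0]
    return None


def works_on(conn, accessory_name, console_name):
    """True if the accessory is usable on the console under the rules above."""
    acc = conn.execute(
        "SELECT id, family_id FROM accessory WHERE name = ?", (accessory_name,)
    ).fetchone()
    if acc is None:
        return False
    allowed = {r[0] for r in conn.execute(
        "SELECT console_id FROM accessory_console WHERE accessory_id = ?", (acc[0],))}
    hosts = conn.execute(HOSTS_SQL, {"name": console_name}).fetchall()
    # No restriction rows means the accessory works across its whole family.
    return any(fam == acc[1] and (not allowed or cid in allowed) for cid, fam in hosts)
```

With the sample rows, a "standard controller" on the 32X would resolve to the Genesis default, while the TurboDuo game pad is accepted on the TurboDuo but rejected on the TurboGrafx-16.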

This might all sound a bit frustrating or counter-intuitive but it makes the data much more robust while also eventually allowing more searching, sorting, and filtering options for the new site.  Having the proper relations between all the data points is a good practice for this kind of data and will also cut down on all those errors and inconsistencies found in the existing data.

Currently, most of the data has been moved over - consoles, accessories, companies, people (shout out to ApolloBoy for helping with the person data).
I have been writing an application that takes JSON files of the data in the existing database and generates SQL queries to run in the new database.  I've used it for all of the migrations so far and the game data is no exception.  However, it's also the most involved table.  For each game or variant, up to 15 different tables need to be updated.  It's a lot, but ideally this will open up a lot of possibilities in the future.  For instance, we'll be able to search for all PlayStation 4 games released in the US that are both first-person and metroidvanias.  Or we can find all the games that Square and Nintendo collaborated on.  Further, we can add things like collection stats beyond just how many games for a platform you own.  There are many more similar examples that I hope we'll be able to support in the long term.
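For anyone curious what "JSON in, SQL out" can look like, here is a minimal sketch in Python.  It is not the actual migration application; the function names, the "company" table, and the sample record are all hypothetical, and it assumes each JSON record is a flat object whose keys match column names.

```python
import json


def load_records(path):
    """Read a JSON file that holds a list of flat objects (assumed layout)."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def build_inserts(table, records):
    """Return one (sql, params) pair per record, using ? placeholders."""
    statements = []
    for record in records:
        columns = sorted(record)
        # Table and column names cannot be bound as parameters, so only
        # plain identifiers are accepted here.
        for name in (table, *columns):
            if not name.isidentifier():
                raise ValueError(f"unsafe identifier: {name!r}")
        column_list = ", ".join(columns)
        placeholders = ", ".join("?" for _ in columns)
        sql = f"INSERT INTO {table} ({column_list}) VALUES ({placeholders});"
        statements.append((sql, tuple(record[c] for c in columns)))
    return statements


if __name__ == "__main__":
    sample = json.loads('[{"id": 1, "name": "Capcom"}]')
    for sql, params in build_inserts("company", sample):
        print(sql, params)
```

Keeping the values as parameters rather than pasting them into the SQL text means stray quotes in old free form fields can't break the generated statements.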

So what's next after the data is migrated?  I'll be able to start putting together some back end APIs to allow access to this data and work on actual site mock-ups to show how things could look.  Once some mock-ups are created, I'll be sharing them in the Discord for community feedback.  It's still a little early, but once some of these pieces are functional, it'll also be easier to get the community involved in verifying data and looking into proposed features.  There is still work needed to migrate the forum pieces as well, but I'm mostly focusing my efforts on the main collection tracking portions to start with.

I hope this helps give a glimpse into what's happening behind the scenes.  If there are any questions/thoughts/concerns, you can put them in the comments below or reach out to me in the Discord.



Posted on Mar 18th 2025 at 01:50:09 AM by (Misto)
Posted under devlog

Hello, it's time for the second entry discussing progress on the RFG rebuild.  I know it's been a little quiet on the front page in this regard but work is progressing.  This time I'm mostly going to focus on the database work that is ongoing.

As I mentioned in the previous dev log, we have a large amount of data that we are planning to transform into a new structure to allow us more flexibility in adding game data.  However, this means we need to take all the existing data and move it to a new database.  This isn't a simple task since we are not performing a one-to-one migration.  A lot of the data needs to be cleaned up and reshaped to fit the new structure.  The first hurdle is fixing the developer and publisher fields.  If you've ever added an entry to the site, you know that those fields are free form text - meaning you can put any text you want in there.  The problem is that we end up with a lot of duplicate entries, spelling errors, and different styles people have used.  Take, for instance, a game with two developers (A and B) - I've seen all of the following different styles in the DB:

  • A / B
  • A & B
  • A and B
  • A with B
  • A, B
  • A; B
  • A (B)
  • A (for PC) / B (for Mac)
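As a rough illustration of why these styles are awkward to handle in code, here is a small Python sketch that splits the variants listed above into separate names.  It is only a heuristic under stated assumptions: the function name and the list of separators are made up for this example, and real names that contain "&", "and", or a comma would still need manual review.

```python
import re

# "(for PC)" style notes are dropped; the platform hint is not kept here.
_PLATFORM_NOTE = re.compile(r"\s*\(for\s+[^)]*\)", re.IGNORECASE)
# Any other parenthesised text is treated as a second company name.
_PAREN_NAME = re.compile(r"\s*\(([^)]+)\)")
# Separators seen in the list above. A comma is skipped when it is followed
# by a company designation, so "A, Ltd." stays in one piece.
_SEPARATORS = re.compile(
    r"\s*(?:/|&|;|,(?!\s*(?:ltd|limited|inc|llc|co|corp)\b)|\band\b|\bwith\b)\s*",
    re.IGNORECASE,
)


def split_companies(raw):
    """Split a free form developer/publisher string into individual names."""
    text = _PLATFORM_NOTE.sub("", raw)        # "A (for PC)" -> "A"
    text = _PAREN_NAME.sub(r" / \1", text)    # "A (B)" -> "A / B"
    parts = (part.strip() for part in _SEPARATORS.split(text))
    return [part for part in parts if part]
```

All eight styles in the list come out as two names, A and B, but a rule this simple would also wrongly split a single company whose name includes one of the separators, which is part of why the cleanup can't be fully automated.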

Not to mention duplicate entries (not including typos) for different company designations:

  • A, Ltd.
  • A Ltd
  • A, Ltd
  • A Ltd.
  • A Limited
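The designation variants above can be collapsed with a small normalizer.  This Python sketch is an assumption-heavy example rather than the approach used in the migration: the function name, the choice of "Name, Ltd." as the canonical form, and the short designation table are all made up for illustration.

```python
import re

# Hypothetical mapping of designation spellings to one canonical form.
_DESIGNATIONS = {
    "ltd": "Ltd.",
    "limited": "Ltd.",
    "inc": "Inc.",
    "incorporated": "Inc.",
}
# Company name, then a space and/or comma, then a trailing word with an
# optional final period.
_SUFFIX = re.compile(r"^(?P<name>.*?)[\s,]+(?P<suffix>[A-Za-z]+)\.?$")


def normalize_company(raw):
    """Return the name with a known trailing designation in canonical form."""
    text = " ".join(raw.split())  # collapse runs of whitespace
    match = _SUFFIX.match(text)
    if match:
        canonical = _DESIGNATIONS.get(match.group("suffix").lower())
        if canonical:
            return f"{match.group('name')}, {canonical}"
    return text
```

Grouping the old entries by their normalized form is one way to surface likely duplicates for review before they go into the new table.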

This makes it very hard to normalize the data into a "Company" table and very hard to manually go through the data to "fix" it.  We have nearly 15,000 unique entries just for companies.  I am slowly going through and fixing these to populate a new table that we can tie games and hardware to.  We also want to pull out all of the individual person names to populate a "person" table that can be used for game credits as well as developer/publisher fields.  The good news is once it's done, we won't have to worry about duplicates or typos going forward.  When submitting a game, you'll just need to select the company from the existing values (don't worry, there will be a way to add companies as well if needed).  Further, since it's hard to verify that there won't still be duplicates after this, I'm planning on adding a "merge companies" function for site admins in case there were mistakes in the migration.
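A "merge companies" function could work roughly like the sketch below, which repoints every link from the duplicate company to the one being kept and then removes the duplicate.  This is a guess at the shape of such a feature, not the planned implementation: the table names (company, game_company), their columns, and the function name are hypothetical, and it uses Python's built-in sqlite3 module purely for demonstration.

```python
import sqlite3

# Hypothetical tables: companies, plus a link table tying games to companies.
SCHEMA = """
CREATE TABLE company (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE game_company (
    game_id INTEGER NOT NULL,
    company_id INTEGER NOT NULL REFERENCES company(id),
    role TEXT NOT NULL,                      -- e.g. 'developer', 'publisher'
    PRIMARY KEY (game_id, company_id, role)
);
"""


def merge_companies(conn, keep_id, duplicate_id):
    """Move all links from duplicate_id to keep_id, then delete the duplicate."""
    if keep_id == duplicate_id:
        raise ValueError("cannot merge a company into itself")
    with conn:  # one transaction: commit on success, roll back on error
        # Repoint links; rows that would collide with an existing link on
        # keep_id are skipped by OR IGNORE and cleaned up just below.
        conn.execute(
            "UPDATE OR IGNORE game_company SET company_id = ? WHERE company_id = ?",
            (keep_id, duplicate_id),
        )
        conn.execute("DELETE FROM game_company WHERE company_id = ?", (duplicate_id,))
        conn.execute("DELETE FROM company WHERE id = ?", (duplicate_id,))


def build_demo_db():
    """Two spellings of the same company, linked to two sample games."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executescript("""
        INSERT INTO company VALUES (1, 'A, Ltd.'), (2, 'A Limited');
        INSERT INTO game_company VALUES
            (10, 1, 'developer'), (10, 2, 'developer'), (11, 2, 'publisher');
    """)
    return conn
```

Running the merge inside one transaction means a failure partway through leaves both companies untouched instead of half merged.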

Once the companies are migrated, there will be a few smaller tables that will need the same treatment such as controller and genre fields.  Then a more programmatic approach can be taken to migrate all the actual games and hardware and tie them to new entries for all the mentioned fields - standardizing all the data.  This data work will also allow us to start building concrete APIs and UI mockups to move forward which will be much more tangible for most people reading this.

Anyway that's all for this one, I hope it was a fun read.  Once more progress is made, I imagine entry 3 will discuss some more of the data structure and API work that will need to happen to move forward.



Posted on Jan 28th 2025 at 11:57:31 PM by (Misto)
Posted under devlog, programming, development, planning, overhaul

Hello and welcome to the first RF Generation Dev Log!

For a quick introduction, I'm Misto, one of the volunteers working on bringing RF Generation into the modern era of websites.  If you are not aware, dev logs are commonly used to document and bring awareness to the efforts, struggles, plans, etc. behind all the work that happens throughout a project.  I'm hoping that this will be a fairly regular type of blog post just so everyone can see that, yes, work is being done - whether it's visible or not.  There has been a lot of discussion and mentions in the recent blogs about the site redesigns, but I wanted to use this first post to outline all the steps that need to be done before anything can even move forward.  Also, this post will be long, so get ready.

First and foremost, as many of you know and as has been mentioned in past blogs, RFG is pretty dated.  Not that it's a bad thing per se, but it does limit us from expanding or building on top of what is in place.  Right now, much of the site is running on outdated software and we really don't know if any upgrades or updates will break current functionality.  Most of the original developers have since moved on as well, so we are stuck with a bunch of hacked together legacy code.  What I'm currently working on is creating a separate server that mimics the current site configuration but runs on the latest and greatest (operating system, hardware, PHP/Perl versions, and database version).  I did manage to get the OS and database migrated successfully but I'm still working on getting the actual site code moved over.  The test server I'm running is in AWS (Amazon Web Services), which has some free tiers that are perfect for this upgrade test.  Unfortunately, we have too many images in the database and they don't fit on this tier (free tier only has 60 GB of hard drive space and we have about 25 GB of game images alone!).

This brings me to my next point - costs.  The unfortunate reality is hosting a website outside of a basic blog isn't cheap.  A lot of software and hardware needs to come together to make anything run and it all costs money.  Just knowing that we can't run the site on a small server with a 60GB hard drive means we need to pay for more space on the server.  I mentioned before that I'm using AWS to host a test server.  I ran through multiple configurations to determine costs based on the requirements I think we would need in a worst case.  Requirements include the obvious, like CPUs and hard drive space, but also the less obvious - how many visitors the site gets per day, whether the server will be running 24/7 (yes for us), how many requests we get, whether there are spikes in traffic (for instance, does most traffic come between 10-11PM on a Wednesday), and how big each request is.  All of this needs to be taken into account.  Most basic and cheap web hosting is going to run into limitations for anything beyond a simple WordPress site.

The good news is with AWS (or a similar cloud provider), I think a lot of the costs I calculated so far are around the current server costs.  But moving gives us a few benefits.  One is scalability - did your server go down unexpectedly?  Just spin up a copy of it.  Need to make changes?  Just spin up a test server and try the changes there instead of pushing them out and hoping for the best.  There is also another potential benefit: the possibility that we can get away from cPanel.  cPanel is great for a lot of things, but as we grow, it's hard to use and difficult to manage with a team of programmers and other volunteers.  I can't find a good way to give access to other people without giving them full server access.  It's also a completely managed service, meaning the software, site, databases, and images are all stored on one server (remember the 60GB limit above?  cPanel takes up about 20 of that alone).  It also limits you to specific programming languages and other supporting software you can use.  It's also about $30 a month for a license.  We have discussed a full site rebuild in the Discord.  Using a cloud service like AWS, we can utilize services that could allow us to make a more robust site while also potentially coming in a bit cheaper than what we currently pay.

Moving on from server configurations, let's talk data.  RFG has a fantastic amount of game and hardware data but it can be organized in a better way.  A lot of you have pitched in with ideas and we've come up with a schema (how the DB is organized) that we think will address some of those issues.  First is the games - we've built in a sort of hierarchical structure to better organize them all.  I'm going to use my favorite series as an example - Resident Evil.  First you have a "Franchise", in this case the "Resident Evil" franchise.  This will have any shared info (like creator, first game release, etc).  Most of this is more for fun than anything but it could also be used to see how many games in a series you are missing.  Then we have the "game"; this could be "Resident Evil (2002)" for instance.  This will have all the shared info for a specific game (alternate titles, descriptions, etc).

Under that is what we mostly care about - the "variant".  The variant will be things like the original GameCube North American release or the Wii re-release in Japan, etc.  These are the games that are all in our collections.  Variants will now optionally support a lot more data than we currently have:
  • Credits
  • Multiple developers/publishers/distributors
  • Listing for bonus physical goods for collector editions
  • Flags for homebrew, digital only, full game on media, online only, cross platform support
  • Print run of the game
  • Inclusion of download codes
  • Required or supported controllers
  • Hardware or OS requirements for PC games
  • An "appears in" option - tie games to compilations
  • Flag for pack-in games for consoles
  • Standardization of genres and game modes
  • Multiple images for each "type", as in multiple front box images or cartridge images
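The franchise / game / variant hierarchy described above can be pictured with a few Python dataclasses.  This is only a loose sketch and not the real schema: the class and field names are assumptions, only a handful of the listed variant fields are shown, and the second sample game is just filler data for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Variant:
    """One specific release, i.e. the thing that sits in a collection."""
    id: int
    platform: str
    region: str
    developers: list[str] = field(default_factory=list)
    publishers: list[str] = field(default_factory=list)
    homebrew: bool = False
    digital_only: bool = False
    pack_in: bool = False


@dataclass
class Game:
    """Shared info for a specific game (alternate titles, descriptions, etc)."""
    title: str
    alternate_titles: list[str] = field(default_factory=list)
    variants: list[Variant] = field(default_factory=list)


@dataclass
class Franchise:
    """Shared info for a whole series."""
    name: str
    games: list[Game] = field(default_factory=list)


def missing_games(franchise, owned_variant_ids):
    """Titles in the franchise for which no variant is owned."""
    owned = set(owned_variant_ids)
    return [
        game.title
        for game in franchise.games
        if not any(variant.id in owned for variant in game.variants)
    ]


def demo_franchise():
    """Sample data based on the Resident Evil example in the post."""
    remake = Game("Resident Evil (2002)", variants=[
        Variant(1, "GameCube", "North America"),
        Variant(2, "Wii", "Japan"),
    ])
    re4 = Game("Resident Evil 4", variants=[Variant(3, "GameCube", "North America")])
    return Franchise("Resident Evil", games=[remake, re4])
```

Owning any one variant counts as having the game, so a collection holding only the Wii release would show "Resident Evil 4" as the one title still missing from this sample franchise.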

I think that about wraps things up for this post.  As usual, let us know what features/improvements/bugs/etc need to be addressed.  It may be a long time before we can get to them, but I wanted to show that progress is being made behind the scenes.  I'm hoping soon we can have a better idea of what path forward we are taking (some of this is dependent on costs and how much we get in donations), but in the meantime I'm working on getting a git repository of the current code set up to give developers access, as well as somewhere we can document and submit issues or requests in a more formal way.


Site content Copyright © rfgeneration.com unless otherwise noted. Oh, and keep it on channel three.