Developer Doc

This document explains more than you ever wanted to know about the Data Generator: how it works, how it's structured, and how to extend it. May the thrills commence.

Introduction

If you haven't already done so, check out the script online and generate some data. You should have a general sense of what the script does before you bother reading any further.

Version 3.0.0 of the Data Generator was a complete redesign of the script to make it properly modular: now it's primarily an engine that includes an interface, installation script, user account system, and a standardized way for plugins to be integrated with the script. The really interesting part is the plugins themselves: they provide all the functionality of the script. And that's where you come in.

The Developer Doc focuses on how you, as a developer, can write new modules. Different module types require different degrees of technical knowledge: providing new translations is very basic; providing new Country plugins is fairly simple, but requires basic PHP; creating new Export Types and Data Types is more complicated and requires both JS and PHP expertise. But I'm getting ahead of myself... let's start with an overview of the module types.


Module Types

Note: the words plugin and module are used synonymously.

The Data Generator accommodates the following types of module:

Data Types

These govern what kind of data can be generated through the interface. You get a huge amount of control and customizability out of these suckers. For example:

  • They can generate anything you want - strings, numbers, URLs, images, binary data, code, ascii art, you name it.
  • They can display any arbitrary settings to allow in-row configuration by the user, customizing the particular output for the Data Type row.
  • You can add custom JS validation to ensure the values are well formed.
  • They can access and depend on other Data Types in the generated result sets to customize their output.
  • They can generate different content depending on the selected Export Type (HTML, CSV, XML, etc.) and the export target (in-page, prompt to download, new tab).

More about Data Types »

Export Types

Export Types are the formats in which the data is actually generated: XML, HTML, CSV, JSON, etc.

More about Export Types »

Country Plugins

In order to generate realistic-looking human-related data, you need to actually provide the data set to pull from. The Country plugins let you do just this: you provide the region and city data for a particular country. This allows various Data Types to intelligently generate rows of data with regions, cities and postal codes that match the country selected. These are very simple plugins to create.

More about Country Data »

Translations

The entire Data Generator interface is translatable. At the top right of the interface, there's a dropdown that lists all available languages. The default languages other than English were auto-generated with Google Translate. As such, they're in need of proper translations! Click the button below to learn more about translations and how to provide your own / update the existing ones.

More about Translations »

Code Architecture

A few words on how the code is organized, to give you a sense of how it all fits together.

PHP

The settings.php file in the root folder contains the unique settings for the current installation - MySQL database settings, and so on. It is created automatically by the installation script, and it is the only file that contains custom information for the installation.

The PHP codebase is object-oriented, with all core classes found in /resources/classes/. The library.php file - again found in the root - is used as the main entry point: all code that needs access to the core codebase just needs to include that single file.

Core.class.php

The Core.class.php file is special. It's a static class (or would be if PHP permitted it!) that acts as the global namespace for the backend code. When Core::init() is run, it does all the stuff you need to run the script, namely:

  • Parses the settings.php file and stores all the custom settings for the environment.
  • Makes a connection to the database.
  • Automatically handles serious errors like database connection problems, or Smarty not being able to generate the page due to permission errors.
  • Loads up all Data Types, Export Types and Country plugins and renders them appropriately on the screen.
  • Loads the current language file.
  • Lots of other nagging juicy stuff.

It also contains numerous helper functions. Check out the source code for more details.

Smarty Templates

So... where the hell's the markup? If you're anything like me, you hate examining a new codebase to find you can't even track down the HTML. I know, it's annoying. Check out /resources/templates/. That contains the bulk of the HTML used to generate webpages. You can read more about Smarty on their website. The script uses version 3.

Custom Smarty Functions

When you look through the templates, you may notice the occasional non-standard Smarty function, like {country_plugins}. These are all found in /resources/libs/smarty/plugins/. That's actually the same folder as all the default Smarty modules and functions. If you're not familiar with Smarty, it's worth fishing through that folder to get an idea of how those files map to the actual functions and modifiers that you can use in the Smarty templates.
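
As a rough illustration of that mapping: Smarty plugin files are named type.name.php, and each one defines a PHP function called smarty_type_name. The sketch below is written in JavaScript purely to show the naming rule; the helper name describeSmartyPlugin is made up, it only covers the function and modifier plugin types, and it assumes that {country_plugins} follows the usual convention and so lives in a file called function.country_plugins.php.

```javascript
// Illustration of Smarty's plugin naming convention, not part of the script.
// "<type>.<name>.php" defines a PHP function called "smarty_<type>_<name>".
// Only the "function" and "modifier" plugin types are handled here.
function describeSmartyPlugin(fileName) {
  const match = /^(function|modifier)\.([A-Za-z0-9_]+)\.php$/.exec(fileName);
  if (!match) {
    return null; // not a function or modifier plugin file
  }
  const type = match[1];
  const name = match[2];
  // How the plugin would be written inside a template.
  const templateUsage =
    type === "function" ? "{" + name + "}" : "{$value|" + name + "}";
  return {
    type: type,
    name: name,
    phpFunction: "smarty_" + type + "_" + name,
    templateUsage: templateUsage
  };
}

const countryPlugin = describeSmartyPlugin("function.country_plugins.php");
console.log(countryPlugin.phpFunction);   // smarty_function_country_plugins
console.log(countryPlugin.templateUsage); // {country_plugins}
```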

JavaScript

The client-side code is built around RequireJS. All the JS module code works the same way, regardless of whether the code is for the Core, a Data Type or an Export Type. Country plugins are entirely PHP - no JS required.

  • Each module is sandboxed by RequireJS, to ensure it doesn't pollute the global namespace.
  • Modules interact with one another using publish / subscribe messages, not by calling one another directly.
  • All modules register themselves with the Manager, which is found here: /resources/scripts/manager.js. The Manager handles all pub/sub messaging and ensures that the module being registered contains all the functions required to integrate with the script.
  • All save/load functionality for a Data Type and Export Type is done via the JS module. When the user saves or loads a data set via the interface, the core script calls the appropriate functions in each module's JS file: on save, they serialize the module's data for database storage; on load, they are passed the saved information to re-populate the page. It's actually pretty simple once you see it in action: see the Data Types or Export Types pages for more information.
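
To make the bullets above a little more concrete, here is a minimal sketch of a registry with publish / subscribe messaging and save/load hooks. It is not the code in /resources/scripts/manager.js: every name in it (createManager, register, publish, subscribe, saveAll, loadAll, the "row:added" message and the required save and load functions) is an assumption made for illustration, and it uses a plain closure rather than RequireJS so it can be run on its own.

```javascript
// Hypothetical sketch only: every name here is an assumption made for
// illustration, not the real manager.js API.
function createManager(requiredFunctions) {
  const modules = {};     // module ID -> registered module
  const subscribers = {}; // message name -> list of callbacks

  return {
    // Accept a module only if it provides every required function.
    register(moduleId, module) {
      const missing = requiredFunctions.filter(
        (name) => typeof module[name] !== "function"
      );
      if (missing.length > 0) {
        throw new Error(moduleId + " is missing: " + missing.join(", "));
      }
      modules[moduleId] = module;
    },

    subscribe(message, callback) {
      if (!subscribers[message]) {
        subscribers[message] = [];
      }
      subscribers[message].push(callback);
    },

    // Modules talk to each other through messages, never directly.
    publish(message, data) {
      (subscribers[message] || []).forEach((callback) => callback(data));
    },

    // Saving: ask each module to serialize its own state.
    saveAll() {
      const saved = {};
      for (const moduleId of Object.keys(modules)) {
        saved[moduleId] = modules[moduleId].save();
      }
      return saved;
    },

    // Loading: hand each module its previously saved state.
    loadAll(saved) {
      for (const moduleId of Object.keys(saved)) {
        if (modules[moduleId]) {
          modules[moduleId].load(saved[moduleId]);
        }
      }
    }
  };
}

// Example module; the closure keeps its state out of the global namespace.
function createCounterModule(manager) {
  let rowCount = 0;
  manager.subscribe("row:added", () => { rowCount += 1; });
  return {
    save: () => ({ rowCount: rowCount }),
    load: (data) => { rowCount = data.rowCount; }
  };
}

const manager = createManager(["save", "load"]);
manager.register("counter", createCounterModule(manager));
manager.publish("row:added", {});
manager.publish("row:added", {});
console.log(manager.saveAll()); // { counter: { rowCount: 2 } }
```

The point of the design is that the counter module never calls another module directly: it reacts to a message and exposes only the functions the registry asked for, so a module with a missing function is rejected at registration time rather than failing later during a save.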

The pub/sub messages can be viewed right in the Data Generator by going to the Settings tab and choosing which information you want to see in your browser console through the Developer section.

See the appropriate module documentation section for more info on how all this works from a practical viewpoint.


Translations / I18N

The Data Generator has built-in multi-language support; a user can easily change the UI language via a dropdown and the page automatically redraws with the new language. This means all language strings for the Core and all plugins need to be extracted and placed in separate language files.

After a lot of humming and hawing, I decided to use a simple PHP array to store the language strings. The Core language strings are found in /resources/lang/. There you'll see there's a separate file for each language. The Data Types and Export Types all have their own language files which need to be stored in a /plugins/[data-or-export-type-folder]/lang/ folder. When the user picks a language through the interface, the Core script automatically figures out what language files are present and attempts to load the right one. If it can't find it, it will load the default English one (yes, an en.php language file is required for Data and Export Type modules).
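
The fallback rule described above can be sketched as follows. The real lookup happens in the PHP Core; this JavaScript version only illustrates the logic. The function name resolveLanguageFile and the shape of its arguments are assumptions, as is the idea that other languages use the same naming pattern as en.php (fr.php, and so on).

```javascript
// Hypothetical sketch of the fallback rule; the real logic lives in the PHP Core.
// availableFiles: file names found in a module's lang/ folder.
// selectedLanguage: the language code picked in the interface.
function resolveLanguageFile(availableFiles, selectedLanguage) {
  const wanted = selectedLanguage + ".php";
  if (availableFiles.includes(wanted)) {
    return wanted;
  }
  // en.php is required for Data and Export Type modules, so it is the fallback.
  if (availableFiles.includes("en.php")) {
    return "en.php";
  }
  throw new Error("Module has no en.php language file");
}

const langFiles = ["en.php", "fr.php"];
console.log(resolveLanguageFile(langFiles, "fr")); // fr.php
console.log(resolveLanguageFile(langFiles, "de")); // en.php
```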

Google Translate auto-translations

The base translation is provided by Google Translate. It's pretty poor, but it's better than nothing. I'd LOVE people to help improve the translations! For more info on that, see the Translations page.


SASS

In the unlikely event that you need to tweak the CSS for the Core script, bear in mind it's auto-generated from SASS templates, so editing the CSS files directly is the wrong way to go about it. You'll need to edit the SASS files at /resources/themes/[theme]/sass/.

Check out sass-lang.com for more information.


Resource Minification and Bundling

In version 3.0.7, I added optional CSS and JS minification and bundling to speed up load times. I figure in most scenarios you won't much care about this, since it'll be running in a local environment and will be pretty speedy to load anyway. But if you do want to improve page load times and reduce the total bandwidth for the script, here's how to go about it. Please bear in mind I really only added this for the sake of the public website, so it's not as simple as the rest of the script to get going. Apologies.

How to set up

  1. To create the minified and bundled files, you need to have Node installed. Download it from nodejs.org/. It runs on all platforms.
  2. After installing it, in Terminal / the Windows command prompt, navigate to the generatedata folder and type the following command: npm install. If Node is configured correctly, it will download all the necessary files.
  3. Next, type npm install -g grunt-cli. That installs the grunt command line tools so you can run grunt from the command line.
  4. In the Data Generator, click on the "Reset Plugins" button on the Settings tab.
  5. On the command line, type grunt. That should create all the new files.
  6. In your settings.php file, add the following line: $useMinifiedResources = true;

And that should be it. It should have created a number of different files in your /cache folder (CSS and JS), which should now be linked to in the script's webpages.