Provide new types of data for generation.
This page explains how to add your own data types so you can use the Data Generator to generate pretty much whatever crazy stuff you want.
Data Types are self-contained plugins that generate a single random data item, like a name, email address, country name, country code, image, picture, URL, barcode image, binary string - really anything you want. Data Types can offer basic functionality, like the Email Address Data Type, which has no options, examples or help doc, or they can be more advanced, like the Date Data Type, which contains examples of date formats for easy generation and a date picker dialog (jQuery UI). Data Types can be standalone and generate data that has no bearing on other fields - like the Alpha Numeric Data Type - or make decisions about their content based on other fields in the data set, like Region, which intelligently generates a region within whatever country has been randomly generated for that row. Finally, if you want to get really fancy, you can even create Data Types that generate content based on previously generated row data, like the Tree Data Type, which creates a tree-like data structure by mapping the ID of each row to a single parent row ID.
Data Types have both a PHP and (optional) JS component. The PHP is used to do the actual generation; the JS is used for creating the UI and saving/loading the Data Type data.
When creating your new Data Type, you can add anything you need from client-side validation to custom dynamic JS/DOM manipulation. You can also generate different content based on the selected Export Type (SQL, XML etc). It's a pretty flexible system, so hopefully you won't run into any brick walls. And if you do, you can just drop me a line and explain the shortcomings.
Lastly, I tried to make the process of adding Data Types as simple and as sandboxed as possible. The Core script does an awful lot for you: all you really need to do is follow the instructions below and maybe look at the existing Data Types for inspiration. Once you wrap your head around how it all fits together, developing new Data Types should be pretty straightforward.
Alrighty! Let's start with looking at the actual files and folders that go into a Data Type.
Now let's do a high-level view of what goes into a module: the files and folders, the JS + PHP components and how the translations / internationalization works. We'll get into the details about the code in the following sections.
All Data Types are found in the /resources/plugins/dataTypes/ folder. Each Data Type has its own folder, which acts as the namespace for the JS and PHP code. What I mean is that the exact string you choose for the folder (like AlphaNumeric or StreetAddress) has to be used in your JS module creation and PHP class definition. I'll explain all that below.
A Data Type is made up of the following files (the JS file is optional). Let's assume the folder name is MyNewDataType.
- /resources/plugins/dataTypes/MyNewDataType/MyNewDataType.js: this file can actually be called whatever you want, but for consistency and to keep the Web Inspector / Firebug net panel easy to read, I'd name it like this. You can have as many JS files as you want, but one is almost certainly enough.
- /resources/plugins/dataTypes/MyNewDataType/MyNewDataType.class.php: this contains your DataType_MyNewDataType class, which handles all necessary server-side code: the data generation and any markup you want available in the generator webpage. More info about all that below.
- /resources/plugins/dataTypes/MyNewDataType/lang/en.php: a PHP file containing a single array (hash) that lists all strings used in your module.
You can also include any custom CSS files you want. See the PHP class definition below for more information.
The JS module for your Data Type does the following:
- Registers with the Manager JS component, to allow it to publish and subscribe to messages; i.e. to interact with the Core script and detect when certain user interface events happen.
- Saves and loads the Data Type's data, and runs any client-side validation it needs.
The PHP class for your Data Type handles the following functionality:
- Provides any markup for the Example and Options columns in the generator table.
- Generates the data itself, optionally based on other fields in the row: for example, a Region field can check to see if a Country field has been included, and if so, generate a random region within the country for that row.
All text strings that appear in your module should be pulled from a language file. It's very simple. Just create a file called en.php in your /resources/plugins/dataTypes/[data type folder]/lang/ folder. That file should contain a single $L hash, like so:
    <?php

    $L = array();
    $L["DATA_TYPE_NAME"] = "Alphanumeric";
    $L["example_CanPostalCode"] = "(Can. Postal code)";
    $L["example_Password"] = "(Password)";
    // ...
Once you do that, the Data Generator automatically makes that information accessible to your PHP and JS code. I'll explain how that works in the following sections.
All plugins - Data Types, Export Types and Country plugins - have to extend a base, abstract class defined by the core code. Hopefully you know what this means, but if not - time for some Googling! Simply put, abstract classes are a mechanism to help ensure that the class being defined has a proper footprint and contains all the functionality that's expected and required.
For Data Types, take a look at this file: /resources/classes/DataTypePlugin.abstract.class.php. That's the class you'll need to extend.
Now rather than blather on about your Data Type PHP class in the abstract, let's look at an actual implementation first. If you want to see the complete list of available variables and methods, check out the source code of the Data Type abstract class (/resources/classes/DataTypePlugin.abstract.class.php). It's well documented.
This is the PHP class for the GUID Data Type. It's a simple Data Type that generates a random GUID string. Maybe first try it out in the script to see what it does.
    <?php

    /**
     * @package DataTypes
     */
    class DataType_GUID extends DataTypePlugin {
        protected $isEnabled = true;
        protected $dataTypeName = "GUID";
        protected $dataTypeFieldGroup = "numeric";
        protected $dataTypeFieldGroupOrder = 50;
        private $generatedGUIDs = array();

        public function generate($generator, $generationContextData) {
            $placeholderStr = "HHHHHHHH-HHHH-HHHH-HHHH-HHHH-HHHHHHHH";
            $guid = Utils::generateRandomAlphanumericStr($placeholderStr);

            // pretty sodding unlikely, but just in case!
            while (in_array($guid, $this->generatedGUIDs)) {
                $guid = Utils::generateRandomAlphanumericStr($placeholderStr);
            }
            $this->generatedGUIDs[] = $guid;

            return array(
                "display" => $guid
            );
        }

        public function getHelpHTML() {
            return "<p>{$this->L["help"]}</p>";
        }

        public function getDataTypeMetadata() {
            return array(
                "SQLField" => "varchar(36) NOT NULL",
                "SQLField_Oracle" => "varchar2(36) NOT NULL"
            );
        }
    }
Let's look at each line in turn.
- class DataType_GUID extends DataTypePlugin: our class definition. All Data Type class names must be of the following format: DataType_[folder] - where folder is the name of the Data Type folder. Pretty straightforward. Also, note that it extends the DataTypePlugin base class. That's required.
- $isEnabled: this var explicitly enables/disables the module. When you're tinkering around with a new Data Type, sometimes you may not want it to show up in the UI - so you'd just set this to false.
- $dataTypeName: this is the human-readable name of your module. It can be in whatever language you want, but we prefer English as the default language string. The value you enter in this variable is automatically overridden if the currently selected language has the following value in its language file: $L["DATA_TYPE_NAME"] = "New Name"; This provides a simple mechanism for supplying alternative translations of your Data Type names.
- $dataTypeFieldGroup: in the Data Type dropdowns in the generator, you'll notice that the Data Types are all grouped. This variable determines which group your Data Type should appear in. You can choose any of the following strings: human_data, geo, text, numeric, math, other. If you feel that you need a new group for your Data Type, drop me a line.
- $dataTypeFieldGroupOrder: this determines where in the list your Data Type should appear. Look at the values for other Data Types to figure out what value to enter. I spaced them all out with 10 in between to allow you to insert your Data Type at any point in the list.
So far so good. The next line, $generatedGUIDs, is a custom private var for use by this Data Type only. Don't worry about it.
Now let's look at the methods:
- public function generate($generator, $generationContextData): this is the main generation function for the Data Type. It's passed two parameters:
  - $generator: the Generator class, found here: /resources/classes/Generator.class.php. This is a very helpful class - it contains various utility methods for finding out about the current data set being generated. However, the GUID class doesn't need it.
  - $generationContextData: the Generator generates the data sets row by row. Each row contains one or more Data Types. This variable contains all the Data Types generated so far for the current row. Any Data Type can choose to return additional metadata for a particular generated data item - e.g. a Region could choose to return the Country to which it belongs. This second function param contains all that information. Lastly, if a Data Type has dependencies on previous Data Types in the row, it needs to set the protected $processOrder = X; class variable. See the Data Type plugin abstract class for more information about that advanced feature - or look at the Region plugin for an example of how it's used.
- public function getHelpHTML(): this optional function is used to return whatever help text you want for your Data Type. Note that the returned string references a $L class variable: $this->L["help"]. The $L variable is populated with the current language file automatically when the Data Type is instantiated. This mechanism is taken care of for you - you can safely refer to $this->L throughout your own class.
- public function getDataTypeMetadata(): this optional function returns additional meta information about your Data Type. Right now it's really only used for the SQL Export Type. When the user selects SQL, the code needs to know how large a database field should be created for the data. As such, this function returns that information - for both generic SQL and Oracle SQL - so the Export Type can do its job. As mentioned, this is not a required function. If it wasn't supplied, the SQL Export Type would just provide its best guess.
And that's it for our example. The following sections go into greater depth regarding the class member vars and methods. There's a lot more you can do.
Alright! Here's the full list of class vars that have special meaning.
Var | Req/Opt | Type | Explanation |
---|---|---|---|
$dataTypeName | required | string | The human-readable name of the Data Type used in the UI. Note: the $L["DATA_TYPE_NAME"] defined in a language file will override this value. |
$dataTypeFieldGroup | required | string | Data Types are grouped together in the Data Type dropdowns in the UI. This variable lets the system know to which group your Data Type should belong. Possible values are: human_data, geo, text, numeric, math, other. If you feel that you need a new group for your Data Type, drop me a line. |
$dataTypeFieldGroupOrder | required | integer | The order in which the Data Type should appear within the group specified by the previous field. |
$isEnabled | optional | boolean | Hides / shows the module from the interface. Note, you'll need to refresh the list of plugins after changing this value. |
$jsModules | optional | array | An array of JS filenames, all found in the Data Type folder. |
$cssFiles | optional | array | An array of CSS filenames, all found in the Data Type folder. |
$L | auto-generated | array | Do NOT define this variable. When the Data Type is instantiated, this variable is auto-generated and populated with the appropriate language file. |
generate($generator, $generationContextData)

Req/Opt | required |
---|---|
Params | $generator: the Generator class. $generationContextData: the data generated so far for the current row. |
Explanation | This does the work of actually generating a random data snippet. Data Types have to return a hash with at least one key: "display". They can also load up the hash with whatever else they want, if they want to provide additional metadata to other Data Types that are being generated on that row (e.g. Country, passing its country_slug info to Region). |
__construct($runtimeContext)

Req/Opt | optional |
---|---|
Params | $runtimeContext: Data Type classes are instantiated at different times in the code. This parameter is a string that describes the context in which it's being instantiated: ui / generation |
Explanation | An optional constructor. Note: this should always call parent::__construct($runtimeContext); |
Req/Opt | optional |
---|---|
Params | None |
Explanation | This is called once during the initial installation of the script, or when the installation is reset (which is effectively a fresh install). It is called AFTER the Core tables are installed, and you can rely on Core::$db having been initialized and the database connection having been set up. |
getExampleColumnHTML()

Req/Opt | optional |
---|---|
Params | None |
Explanation | If the Data Type wants to include something in the Example column, it should return the raw HTML via this function. If this function isn't defined (or it returns an empty string), the string "No examples available." will be output in the cell. This is used for inserting static content into the appropriate spot in the table; if the Data Type needs something more dynamic, it should subscribe to the appropriate event. |
getOptionsColumnHTML()

Req/Opt | optional |
---|---|
Params | None |
Explanation | If the Data Type wants to include something in the Options column, it must return the HTML via this function. If this function isn't defined (or it returns an empty string), the string "No options available." will be output in the cell. This is used for inserting static content into the appropriate spot in the table; if the Data Type needs something more dynamic, it should subscribe to the appropriate event. |
getHelpHTML()

Req/Opt | optional |
---|---|
Params | None |
Explanation | Returns the help content for this Data Type (HTML / string). |
Req/Opt | optional |
---|---|
Params |
|
Returns |
|
Explanation | Called during data generation. This determines what options the user selected in the user interface; it's used to figure out what settings to pass to each Data Type to provide that function the information needed to generate that particular data item. Note: if this function determines that the values entered by the user in the options column are invalid (most likely just incomplete) the function can explicitly return false to tell the core script to ignore this row. |
getDataTypeMetadata()

Req/Opt | optional |
---|---|
Returns | array |
Explanation | Used for providing additional metadata about the Data Type for use during generation. Right now this is only used to pass additional data to the SQL Export Type so it can intelligently build a CREATE TABLE statement with database column types and sizes that are appropriate to each field type. |
The following methods are defined on the Data Plugin abstract class, which you can use when developing your Data Type.
Function | Explanation |
---|---|
getName() | returns the Data Type name. |
getIncludedFiles() | returns list (array) of included files. |
getDataTypeFieldGroup() | returns the field type group to which this Data Type belongs. |
getDataTypeFieldGroupOrder() | returns the order of the field type group. |
getProcessOrder() | returns the Data Type process order. |
getPath() | returns the path to the Data Type file. |
getJSModules() | returns the array of JS modules. |
getCSSFiles() | returns the array of CSS files for the Data Type. |
isEnabled() | returns whether or not the Data Type is enabled. |
Each Data Type may choose to have an optional JS component: a javascript module that performs certain functionality like saving/loading the data type data, running client-side validation on the user inputs (if required) and triggering whatever additional JS code is necessary.
The JS module is optional. The Core script handles saving and loading the Column Title and Data Type for all Data Types, so if you don't need anything in the Example or Options columns, you don't need to include a JS module.
Explaining how the JS module works can be a little abstract, so let's start with an example.
The following is the JS module for the Alphanumeric Data Type. Give it a look over, then we'll pull it apart and explain each bit below.
    /*global $:false*/
    define([
        "manager",
        "constants",
        "lang",
        "generator"
    ], function(manager, C, L, generator) {

        "use strict";

        /**
         * @name AlphaNumeric
         * @description JS code for the AlphaNumeric Data Type.
         * @see DataType
         * @namespace
         */

        var MODULE_ID = "data-type-AlphaNumeric";
        var LANG = L.dataTypePlugins.AlphaNumeric;
        var subscriptions = {};

        var _init = function() {
            subscriptions[C.EVENT.DATA_TABLE.ROW.EXAMPLE_CHANGE + "__" + MODULE_ID] = _exampleChange;
            manager.subscribe(MODULE_ID, subscriptions);
        };

        var _saveRow = function(rowNum) {
            return {
                "example": $("#dtExample_" + rowNum).val(),
                "option": $("#dtOption_" + rowNum).val()
            };
        };

        var _loadRow = function(rowNum, data) {
            return {
                execute: function() {
                    $("#dtExample_" + rowNum).val(data.example);
                    $("#dtOption_" + rowNum).val(data.option);
                },
                isComplete: function() {
                    return $("#dtOption_" + rowNum).length > 0;
                }
            };
        };

        var _exampleChange = function(msg) {
            $("#dtOption_" + msg.rowID).val(msg.value);
        };

        var _validate = function(rows) {
            var visibleProblemRows = [];
            var problemFields = [];
            for (var i=0; i<rows.length; i++) {
                var currEl = $("#dtOption_" + rows[i]);
                if ($.trim(currEl.val()) === "") {
                    var visibleRowNum = generator.getVisibleRowOrderByRowNum(rows[i]);
                    visibleProblemRows.push(visibleRowNum);
                    problemFields.push(currEl);
                }
            }

            var errors = [];
            if (visibleProblemRows.length) {
                errors.push({
                    els: problemFields,
                    error: LANG.incomplete_fields + " <b>" + visibleProblemRows.join(", ") + "</b>"
                });
            }
            return errors;
        };

        manager.registerDataType(MODULE_ID, {
            init: _init,
            validate: _validate,
            saveRow: _saveRow,
            loadRow: _loadRow
        });
    });
Now let's go line by line.
- /*global $:false*/ - this first line is for jshint/jslint. In my local environment, I use jshint with strict mode to catch problems. This line just tells the linter not to complain about the dollar sign: it's a global, used by jQuery.
- define([ "manager", "constants", "lang", "generator" ], function(manager, C, L, generator) { //... }); - the outer code that wraps the entire JS module is a requireJS define() call; see /resources/scripts/requireConfig.js for the requireJS setup. Each of those discrete modules is in turn passed to the Data Type module via the parameters of the anonymous function that's the second param to define(). Whatever public API those modules reveal is now accessible via the four params: manager, C (constants), L (lang) and generator. When defining your own Data Type module JS file, you'll want to include all four of those params. They all contain useful functionality and data that you'll need.
- "use strict"; - do it! JS strict mode is never a bad idea. :D
- manager.registerDataType(MODULE_ID, { init: _init, validate: _validate, saveRow: _saveRow, loadRow: _loadRow }); - this chunk of code is required for your Data Type. What it does is register your Data Type with the core. That allows it to listen to published events, publish its own events for other code to listen to, tie into the validation functionality and so on. It's pretty straightforward. The manager.registerDataType() function takes two parameters: the unique MODULE_ID constant, defined at the top of the module (see below), and an object containing certain required and optional functions, whose property names have special meanings. Again, more on that below. Now let's go back to the top of the code again.
    /**
     * @name AlphaNumeric
     * @description JS code for the AlphaNumeric Data Type.
     * @see DataType
     * @namespace
     */

    var MODULE_ID = "data-type-AlphaNumeric";
    var LANG = L.dataTypePlugins.AlphaNumeric;
- The MODULE_ID variable is special. It must always be of the form data-type-[FOLDER NAME]. That acts as a unique identifier within the client-side code so the Manager can keep track of who's who.
- The L function param fed to your Data Type contains all language strings in the system - in whatever language is currently selected. To locate the strings for your own module, just reference it by your Data Type folder name, again: L.dataTypePlugins.[FOLDER NAME]
As explained above, the second parameter of the manager.registerDataType() function is an object containing various predefined functions. The table below lists the properties of that object and what they're used for. Note: all properties are optional, but you'll almost certainly need one or more.
Property | Params | Returns | Explanation |
---|---|---|---|
init | — | — | If this is defined for your Data Type, it gets called on page load prior to any events being published. By "event" I mean a custom published event, which I'll explain more thoroughly in the Pub/Sub section below. |
run | — | — | The run() function gets called for all Data Types and Export Types after their init()'s are called. As such, run() can rely on all subscriptions being in place so events published at this juncture will have an audience. |
saveRow | rowNum int | object | When the user saves a Data Set, the Data Generator examines the table and calls the appropriate Data Type's saveRow() method. This method is responsible for determining what information it wants to save for the row. Generally all it does is examine the DOM and extract whatever values the user entered in custom fields that the Data Type field uses. It then returns an object of simple property-value pairs. The row number being passed to this function is the unique row number for the row - it may not be the visual row number seen in the UI. After a row is created, it can be re-ordered. The row number passed to this function can be used for DOM element identification. |
loadRow | rowNum int, data object | — | When the user loads a saved data set, the script calls each Data Type's loadRow() function, passing the appropriate row number and whatever data was originally returned by its saveRow() function. The row number should be sufficient information to identify the appropriate elements in the DOM and re-enter the saved information. |
validate | rows array | array | When the user clicks on the Generate button, the core first validates the information they've entered. If a Data Type defines this function, it means it wants to check the user input for one or more of its custom fields - most likely appearing in the Options column. The rows parameter is an array of row numbers that have this Data Type selected. As mentioned above, the row numbers may not be the visual row numbers, because rows may have been added / removed / re-sorted. However, they can be used to identify the appropriate DOM elements. This function needs to return an array of errors to display - or an empty array if there are no errors. Each array item is an object of the following form: { els: [the problem fields], error: "the error message" }. Check out the Alphanumeric Data Type's validate() function above for an example of how this function can work. |
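To make the validate() contract a little more concrete, here is a minimal sketch that runs without the DOM. It mirrors the Alphanumeric example above, but the function name validateRows, the two lookup callbacks and the sample data are assumptions made up for illustration: in a real Data Type you'd read the values with jQuery and call generator.getVisibleRowOrderByRowNum(), and you'd push the actual field elements into els rather than ID strings.

```javascript
// Sketch of the validate() contract. The two lookup callbacks are stand-ins
// (assumptions) for the jQuery DOM lookup and generator.getVisibleRowOrderByRowNum().
function validateRows(rows, getOptionValue, getVisibleRowNum, incompleteMsg) {
  const visibleProblemRows = [];
  const problemFields = [];

  rows.forEach((rowNum) => {
    if (getOptionValue(rowNum).trim() === "") {
      visibleProblemRows.push(getVisibleRowNum(rowNum));
      // A real Data Type would push the field element itself, not its ID.
      problemFields.push("dtOption_" + rowNum);
    }
  });

  // No problems means an empty array; otherwise one error object covers every incomplete row.
  if (visibleProblemRows.length === 0) {
    return [];
  }
  return [{
    els: problemFields,
    error: incompleteMsg + " <b>" + visibleProblemRows.join(", ") + "</b>"
  }];
}

// Made-up sample data: internal row numbers 4 and 9, shown as rows 1 and 2 in the UI.
const optionValues = { 4: "LLLxxx", 9: "   " };
const visibleOrder = { 4: 1, 9: 2 };

const errors = validateRows(
  [4, 9],
  (rowNum) => optionValues[rowNum],
  (rowNum) => visibleOrder[rowNum],
  "Please fill in the Options field for rows:"
);
console.log(errors);
```

Note that the error message reports the visible row number (2) while the field is identified by the internal row number (9), which is the distinction the table above keeps pointing out.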
As mentioned elsewhere, the client-side code revolves around the idea of publish/subscribe - or pub/sub. Different parts of the script can publish arbitrary events with arbitrary information associated with them, and any module can choose to listen out for particular events and run code when they occur. This is a very elegant pattern: it allows us to keep our modules loosely coupled and reduce the likelihood of introducing dependencies that can break things.
The core script publishes the following events at certain points in the lifetime of the page. They're all found in /resources/scripts/constants.php (returned as JS). You can refer to them in your code via the C parameter, mapping to the constants module. The names are pretty descriptive so I won't bother explaining them any further.
C.EVENT.RESULT_TYPE.CHANGE
C.EVENT.COUNTRIES.CHANGE
C.EVENT.DATA_TABLE.ONLOAD_READY
C.EVENT.DATA_TABLE.ROW.CHECK_TO_DELETE
C.EVENT.DATA_TABLE.ROW.UNCHECK_TO_DELETE
C.EVENT.DATA_TABLE.ROW.DELETE
C.EVENT.DATA_TABLE.ROW.TYPE_CHANGE
C.EVENT.DATA_TABLE.ROW.EXAMPLE_CHANGE
C.EVENT.DATA_TABLE.ROW.ADD
C.EVENT.DATA_TABLE.ROW.RE_SORT
C.EVENT.DATA_TABLE.ROW.HELP_DIALOG_OPEN
C.EVENT.DATA_TABLE.ROW.HELP_DIALOG_CLOSE
C.EVENT.DATA_TABLE.CLEAR
C.EVENT.GENERATE
C.EVENT.IO.SAVE
C.EVENT.IO.LOAD
C.EVENT.TAB.CHANGE
C.EVENT.MODULE.REGISTER
C.EVENT.MODULE.UNREGISTER
Generally you'll want to set up your subscriptions in your module's init() function. Here's how it works:
    ...
    var _init = function() {
        var subscriptions = {};
        subscriptions[C.EVENT.COUNTRIES.CHANGE] = _onChangeCountries;
        manager.subscribe(MODULE_ID, subscriptions);
    };

    var _onChangeCountries = function(msg) {
        console.log(msg);
    };
    ...
    manager.registerDataType(MODULE_ID, {
        init: _init
    });
    ...
That would subscribe to the C.EVENT.COUNTRIES.CHANGE event (which fires when the user adds/removes a country from the Country List section in the UI) and attach a callback function - _onChangeCountries(). The manager.subscribe() function can be called at any time in any of your functions, so you can subscribe to events on the fly.
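If pub/sub is new to you, the following stand-alone sketch may help show what's going on behind a subscribe() call. To be clear, createManager() here is a toy stand-in written for this page, not the real Manager: the event name string, the publish() function and the shape of the message object are all assumptions for the sake of illustration.

```javascript
// Toy stand-in for the Manager, for illustration only: it keeps a list of
// callbacks per event name and calls them when that event is published.
function createManager() {
  const listeners = {}; // event name -> array of { moduleId, callback }

  return {
    subscribe(moduleId, subscriptions) {
      Object.keys(subscriptions).forEach((eventName) => {
        if (!listeners[eventName]) {
          listeners[eventName] = [];
        }
        listeners[eventName].push({ moduleId, callback: subscriptions[eventName] });
      });
    },
    publish(msg) {
      // msg.type names the event; the whole message is handed to each callback.
      (listeners[msg.type] || []).forEach((entry) => entry.callback(msg));
    }
  };
}

// Hypothetical event name standing in for C.EVENT.COUNTRIES.CHANGE.
const COUNTRIES_CHANGE = "countries-change";
const MODULE_ID = "data-type-MyNewDataType";

const manager = createManager();
const received = [];

const subscriptions = {};
subscriptions[COUNTRIES_CHANGE] = (msg) => received.push(msg.countries);
manager.subscribe(MODULE_ID, subscriptions);

// The publisher doesn't need to know who (if anyone) is listening.
manager.publish({ sender: "core", type: COUNTRIES_CHANGE, countries: ["canada"] });
console.log(received);
```

The point of the pattern is in that last publish() call: the sender and the subscriber never reference each other directly, only the event name.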
I thought maybe I'd include this section on how to achieve a few practical things. Let me know if you're stuck on something and maybe I'll expand this section to explain how to do it.
If your Data Type is non-trivial, you'll probably want to include some custom HTML to appear in the Example and Options columns in the generator table. Here's how that works.
First, your PHP class above needs to define the getExampleColumnHTML() and getOptionsColumnHTML() methods. They should return a block of generic markup that the client-side Core code will automatically insert into any row where the user selects your Data Type. Since that same block will be inserted for every row of your Data Type, include the %ROW% placeholder in anything that needs to be unique, e.g. input field names and IDs. When the HTML is inserted into the appropriate locations in the DOM, those placeholders will be replaced by the appropriate row number, thus allowing you to uniquely pinpoint those fields.
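As a rough illustration of the %ROW% substitution, here's what the replacement step might look like. The helper name fillRowPlaceholders and the sample markup are made up for this example; the Core script does this for you, and its actual implementation may well differ.

```javascript
// Hypothetical helper: the Core script's real function name and internals may differ.
function fillRowPlaceholders(template, rowNum) {
  // Replace every occurrence of the %ROW% placeholder with the row number.
  return template.replace(/%ROW%/g, String(rowNum));
}

// Sample markup of the kind getOptionsColumnHTML() could return (illustrative only).
const optionsTemplate =
  '<input type="text" name="dtOption_%ROW%" id="dtOption_%ROW%" />';

const rowMarkup = fillRowPlaceholders(optionsTemplate, 3);
console.log(rowMarkup);
// <input type="text" name="dtOption_3" id="dtOption_3" />
```

That's why the Alphanumeric JS module shown earlier can find its field with $("#dtOption_" + rowNum): the ID is predictable once you know the row number.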
Several client-side code libraries are already available in the page and can be used in your Data Type, including jQuery and jQuery UI.
You can always include additional libraries should you wish, but do try to namespace them.
When you add a new Data Type, just creating the new files and folders won't get it to show up in the UI. First, you'll need to make sure your PHP class and JS module have been created properly, and afterwards you'll need to refresh the UI.
To update the list of available Data Types in the UI, go to the second Settings tab. There, click the Reset Plugins button. A dialog will appear which resets all the available plugins (don't worry, this won't cause any problems with saved content or anything like that). After refreshing the page, you should see your Data Type appear in the Data Type dropdowns in the generator.
If you feel that your Data Type could be of use to other people, send it our way! I'd love to take a look at it, and maybe even include it in the core script for others to download. Read the How to Contribute page.