Allow the script to generate more realistic, country-specific data.
The primary purpose of the script is to generate realistic-looking fake/test data. So when it comes to human-centric geographical information, it needs the actual raw data (city and region names) in order to do its job. That's where the Country plugins come in: they let you provide that information for any country.
The Country plugins are currently pretty basic. Right now, all they're used for is to keep the data across a single generated row looking as consistent as possible. So if the generated row contains "Canada" for the Country field, it will pick a Canadian province for any Region fields, and cities within that region for any City fields. A few more interesting caveats:

- [...] Country row, it will randomly pick any country from the list (200 or so). If the country being output for a row doesn't have a corresponding Country plugin, it will arbitrarily pick any region, city and postal/zip code (since it won't know any better!).
- If there is no Country or Region field, the cities will just be arbitrarily chosen.

Adding your own Country plugin is very simple. Knowing a little PHP would help a lot, but with common sense and a bit of patience, you can probably get by just fine. But before we get into the details, remember this:
Important: the purpose of a Country plugin isn't to provide a 100% accurate, 100% complete list of regions and cities for a country: it's to provide enough information so that the generated data looks valid.
If you were to add in every region and every city/town within a country, the data set could get extremely large, which could slow down the data generation.
Now that's over with, here's how to create a Country plugin:

1. In the /plugins/countries folder, create a new folder for your country. The folder name should be the country name with no spaces, in camel case, i.e. an upper-case letter for each word in the country name, like PapuaNewGuinea.

2. In that folder, create a file called PapuaNewGuinea.class.php (where PapuaNewGuinea is the name of the folder you just created) and add in the following PHP.

        <?php

        /**
         * @package Countries
         */
        class Country_PapuaNewGuinea extends CountryPlugin {
            protected $countryName = "Papua New Guinea";
            protected $countrySlug = "papuanewguinea";
            protected $regionNames = "Papua New Guinean Provinces";
            protected $continent = "oceania";

            // One entry per region, each with its own list of cities.
            protected $countryData = array(
                array(
                    "regionName" => "Province Name 1",
                    "regionShort" => "PN1",
                    "regionSlug" => "province_name_1",
                    "weight" => 1,
                    "cities" => array(
                        "City Name 1",
                        "City Name 2"
                    )
                ),
                array(
                    "regionName" => "Province Name 2",
                    "regionShort" => "PN2",
                    "regionSlug" => "province_name_2",
                    "weight" => 1,
                    "cities" => array(
                        "City Name 3",
                        "City Name 4"
                    )
                )
            );

            public function install() {
                return CountryPluginHelper::populateDB(
                    $this->countryName,
                    $this->countrySlug,
                    $this->countryData
                );
            }
        }

3. Change the class name so that it ends in _YourCountry, e.g.

        class Country_YourCountry extends CountryPlugin {

4. Set $continent to one of the following: africa, asia, europe, central_america, north_america, oceania, south_america.
And that's it!
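To make the row-consistency idea concrete, here is a minimal sketch of how a region and a matching city might be drawn from data shaped like $countryData. It is not the script's actual code: the function name pickRegionAndCity is hypothetical, and treating "weight" as a relative likelihood of a region being chosen is an assumption.

```php
<?php

// Sample data in the same shape as $countryData (placeholder names).
$countryData = array(
    array(
        "regionName" => "Province Name 1",
        "weight"     => 3,
        "cities"     => array("City Name 1", "City Name 2")
    ),
    array(
        "regionName" => "Province Name 2",
        "weight"     => 1,
        "cities"     => array("City Name 3", "City Name 4")
    )
);

/**
 * Hypothetical helper: picks one region at random, treating "weight" as a
 * relative likelihood (an assumption), then picks a city from that region.
 */
function pickRegionAndCity(array $countryData) {
    $totalWeight = 0;
    foreach ($countryData as $region) {
        $totalWeight += $region["weight"];
    }

    // Roll a number in 1..$totalWeight and walk the regions until it is used up.
    $roll = mt_rand(1, $totalWeight);
    $chosen = null;
    foreach ($countryData as $region) {
        $roll -= $region["weight"];
        if ($roll <= 0) {
            $chosen = $region;
            break;
        }
    }

    // The city always comes from the region that was just chosen.
    $city = $chosen["cities"][array_rand($chosen["cities"])];
    return array("region" => $chosen["regionName"], "city" => $city);
}

$row = pickRegionAndCity($countryData);
echo $row["city"] . ", " . $row["region"] . "\n";
```

Because the city is only ever taken from the chosen region's own list, even a short list of regions and cities is enough to keep each generated row looking consistent.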
You may have noticed that in the PapuaNewGuinea example above, there was no Zip / Postal code data or Phone Number formats added for the country. Country plugins are designed to be flexible enough to add any country- or region-specific format.
The basic pattern for adding extended data is to create two things:

- An $extendedData protected member variable in the class that contains the default values for the extended data.
- Within $countryData, define whatever region-specific data is needed.
Here's a couple of existing Data Types that use this feature.
At the time of writing, the only two Data Types that make use of country extended data are the Postal/Zip and Phone-Regional data types. They both generate as appropriate a value as they can, based on the selected countries and the values of the Country and Region fields in the data set.
Here are the first few lines of the Costa Rica Country plugin. Take a look at the $extendedData variable.
    <?php

    /**
     * @package Countries
     */
    class Country_CostaRica extends CountryPlugin {
        protected $continent = "central_america";
        protected $countryName = "Costa Rica";
        protected $countrySlug = "CR";
        protected $regionNames = "Costa Rican Provinces";
        protected $extendedData = array(
            "zipFormat" => array(
                "format" => "ZYxYx",
                "replacements" => array(
                    "Z" => "1234567",
                    "Y" => "01",
                    "x" => "0123456789"
                )
            ),
            "phoneFormat" => array(
                "displayFormats" => array(
                    "xxxxxxxx",
                    "xxxx-xxxx"
                )
            )
        );

        // ...
The $extendedData variable can store whatever information is needed. For the zip format it stores a general zip format for the whole country and a list of replacement values that are used to generate the zip. The phone number needs a list (one is fine) of possible display formats for the phone number. These are selectable via the UI.
Note: the Data Types are what handle all the actual data generation. The developers of those plugins decide the structure of the extended data (for that section) and what info needs to be supplied. As a Country plugin developer you just need to follow the pattern set out in other Country plugins.
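As a rough illustration of how a Data Type might turn a format and its replacement values into a generated value, here is a minimal sketch. It is not taken from the Postal/Zip or Phone-Regional code: the function name generateFromFormat is hypothetical, and the rule it applies (swap each character that has a replacement entry for a random character from that entry, copy everything else through) is an assumption based on the Costa Rica data above.

```php
<?php

/**
 * Hypothetical helper: builds a value from a format string. Any character
 * that appears as a key in $replacements is swapped for a random character
 * from the matching string; every other character is copied unchanged.
 */
function generateFromFormat($format, array $replacements) {
    $result = "";
    foreach (str_split($format) as $char) {
        if (isset($replacements[$char])) {
            $choices = $replacements[$char];
            $result .= $choices[mt_rand(0, strlen($choices) - 1)];
        } else {
            $result .= $char;
        }
    }
    return $result;
}

// Country-wide zip defaults copied from the Costa Rica example.
$zipFormat = array(
    "format" => "ZYxYx",
    "replacements" => array(
        "Z" => "1234567",
        "Y" => "01",
        "x" => "0123456789"
    )
);
echo generateFromFormat($zipFormat["format"], $zipFormat["replacements"]) . "\n";

// The phone display formats list no replacements, so this sketch
// assumes that "x" stands for any digit.
echo generateFromFormat("xxxx-xxxx", array("x" => "0123456789")) . "\n";
```

Characters with no replacement entry, such as the hyphen in a phone format, pass through untouched, so the same helper covers both formats in this sketch.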
To provide region-specific data, you'll need to include an extendedData key in the region's data section, as follows:
    protected $countryData = array(
        array(
            "regionName" => "Alajuela",
            "regionShort" => "A",
            "regionSlug" => "alajuela",
            "weight" => 20,
            "cities" => array(
                "Alajuela",
                "Quesada",
                "San José de Alajuela",
                "San Rafael"
            ),
            "extendedData" => array(
                "zipFormat" => array(
                    "format" => "2zxYx",
                    "replacements" => array(
                        "z" => "01",
                        "Y" => "01",
                        "x" => "0123456789"
                    )
                ),
                "phoneFormat" => array(
                    "format" => "24xxxxxx"
                )
            )
        ),
        array(
            // ...
That will be used by the various data types to override the default values and provide more realistic data for the country.
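The override can be pictured as a simple lookup with a fallback. The sketch below is only an illustration: the function name resolveExtendedData and the "Example Region" entry are hypothetical, and replacing a whole entry (rather than merging it key by key with the country default) is an assumption about how a Data Type might behave.

```php
<?php

/**
 * Hypothetical helper: returns the extended data entry for $key, preferring
 * the region's own value and falling back to the country-wide default.
 */
function resolveExtendedData($key, array $countryDefaults, array $region) {
    if (isset($region["extendedData"][$key])) {
        return $region["extendedData"][$key];
    }
    return isset($countryDefaults[$key]) ? $countryDefaults[$key] : null;
}

// Country-wide default, as in the $extendedData example.
$countryDefaults = array(
    "zipFormat" => array(
        "format" => "ZYxYx",
        "replacements" => array("Z" => "1234567", "Y" => "01", "x" => "0123456789")
    )
);

// A region that overrides the zip format, as Alajuela does above.
$alajuela = array(
    "regionName" => "Alajuela",
    "extendedData" => array(
        "zipFormat" => array(
            "format" => "2zxYx",
            "replacements" => array("z" => "01", "Y" => "01", "x" => "0123456789")
        )
    )
);

// A made-up region with no extended data of its own.
$plainRegion = array("regionName" => "Example Region");

$overridden = resolveExtendedData("zipFormat", $countryDefaults, $alajuela);
$fallback   = resolveExtendedData("zipFormat", $countryDefaults, $plainRegion);

echo $overridden["format"] . "\n"; // prints 2zxYx
echo $fallback["format"] . "\n";   // prints ZYxYx
```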
To keep your Country plugin up to date with whatever extended data is generally used, I'd suggest looking through the various existing Country plugins and seeing what's defined. Extended data should be optional, but naturally you'll want to make your plugin compatible with as many Data Types as possible.
Sharing is much appreciated! To contribute your plugin, please just fork the project on GitHub and submit your changes via a pull request. This is certainly the preferred method to contribute code, but if you don't think you're up for it you can always email me and I'll manually add it in. Please note, all contributions will be expected to be available under the GPL license and released along with the rest of the code. I'll be sure to add in your name as a contributor.