- This bundle uses the `ruflin/elastica` library to query ElasticSearch
- Add the following to your composer.json: `"revinate/analytics-bundle": "dev-master"`
- Create a `Data Source` class which extends `Analytics`
- Create `Filter Sources` and `Custom Filters` if required
- Check the `\Example` folder to see a few examples of how to use this bundle

##Data Source
- A Data Source is a class that exposes certain metrics, dimensions and filters that can be used to query stats
- It is backed by one or more ElasticSearch indices
- You can query stats from one data source at a time
- All Data Sources extend the `Analytics` class, therefore they implement the `Metrics`, `Dimensions` and `Filters` that can work on this data source
- The sections below document how to define this interface

##Metrics
- A metric represents a single numerical value
- A metric has a `name`, which is a human-readable name, and a `field`, which is the ElasticSearch field used to calculate the metric
- A metric can be defined with a `filter`
- All metrics are defined in the `getMetrics()` method of `AnalyticsInterface`
- Example metrics: PageViews, PageVisits, ReviewCount, AverageReviewRating
- A metric can be of the following types:
    - `Metric`: Calculated from ElasticSearch data
    - `ProcessedMetric`: Calculated from other metrics
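
To make the idea of a metric "calculated from other metrics" concrete, here is a minimal plain-PHP sketch. It does not use the bundle's `ProcessedMetric` class; `processMetric()`, `ratingSum` and `reviewCount` are hypothetical names chosen for illustration, and the numbers are made up.

```php
<?php
// Hypothetical stand-in for a processed metric: a callback that derives a
// value from metrics that were already calculated for a bucket.
function processMetric(callable $calculate, array $metricValues)
{
    return $calculate($metricValues);
}

// Metric values as they might come back for one bucket (made-up numbers).
$metricValues = array('ratingSum' => 450.0, 'reviewCount' => 100);

// AverageReviewRating derived from two other metrics, guarding against
// division by zero when a bucket has no reviews.
$averageReviewRating = processMetric(function (array $m) {
    return $m['reviewCount'] > 0 ? $m['ratingSum'] / $m['reviewCount'] : 0.0;
}, $metricValues);

echo $averageReviewRating, "\n"; // prints 4.5
```

The point of the sketch is only the dependency direction: the derived value needs no ElasticSearch field of its own, just the results of other metrics.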

##Dimensions
- A dimension is equivalent to `GROUP BY` in MySQL. When you define a dimension, you can get stats per dimension value.
- A dimension has a `name`, which is a human-readable name, and a `field`, which is the ElasticSearch field that represents this dimension
- All dimensions are defined in the `getDimension()` method of `AnalyticsInterface`
- Example dimensions: accountId, userId, gender, dateCreated
- A dimension can be of the following types:
    - `Dimension`: Normally used for `String` fields
    - `RangeDimension`: Normally used for `Numeric` fields where you can define custom ranges
    - `HistogramDimension`: Normally used for `Numeric` fields where you can define fixed dimension `intervals`
    - `DateRangeDimension`: Normally used for `date` or `datetime` fields where you can define custom date ranges
    - `DateHistogramDimension`: Normally used for `date` or `datetime` fields where you can define fixed dimension `intervals` in `seconds`, `minutes`, `hours`, `days`, `weeks` or `years`
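
The difference between fixed intervals and custom ranges can be sketched in plain PHP. The snippet below mimics how the underlying ElasticSearch histogram and range aggregations assign a value to a bucket; it does not call the bundle's dimension classes, and the function names and the `$ratingRanges` data are hypothetical.

```php
<?php
// Fixed-interval bucketing, as an ElasticSearch histogram aggregation does it:
// the bucket key is the value rounded down to a multiple of the interval.
function histogramBucketKey($value, $interval)
{
    return floor($value / $interval) * $interval;
}

// Custom ranges: "from" is inclusive and "to" is exclusive, matching the
// ElasticSearch range aggregation. Returns the first matching label or null.
function rangeBucketLabel($value, array $ranges)
{
    foreach ($ranges as $label => $range) {
        $aboveFrom = !isset($range['from']) || $value >= $range['from'];
        $belowTo = !isset($range['to']) || $value < $range['to'];
        if ($aboveFrom && $belowTo) {
            return $label;
        }
    }
    return null;
}

// Made-up custom ranges for a numeric rating field.
$ratingRanges = array(
    'low' => array('to' => 3),
    'medium' => array('from' => 3, 'to' => 4),
    'high' => array('from' => 4),
);

$key = histogramBucketKey(37, 10);           // 37 falls in the bucket starting at 30
$label = rangeBucketLabel(3, $ratingRanges); // 3 is excluded from "low", included in "medium"
echo $key, ' ', $label, "\n";                // prints "30 medium"
```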

##Filter Sources
- A Filter Source defines a filter that can be queried via REST
- Define a filter source if you want your stats to be filterable on that source
- A filter source extends the `AbstractFilterSource` class, which requires you to implement the following methods:
    - `getReadableName()`: Human-readable filter name
    - `get($id)`: Returns a `Result` for the given id
    - `getByQuery($query, $page, $pageSize)`: Returns an array of `Result` for a string search query
    - `getEntityName($entity)`: Returns the entity name for a given entity object. Here the entity can be your database PHP object
    - `getEntityId($entity)`: Returns the entity id for a given entity object
- A filter source can also extend `AbstractMySQLFilterSource`, which assumes that the filter source is stored in MySQL. In addition to the methods above, you need to implement:
    - `getModel()`: Returns the model path in the format `MyBundle:Entity`
- Example filter sources: AccountFilterSource, UserFilterSource, GenderFilterSource
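
As a rough sketch of what the five methods listed above are expected to do, here is a standalone in-memory class. It deliberately does not extend `AbstractFilterSource`, whose constructor and `Result` type are not shown in this document; plain arrays stand in for `Result`, the class name and data are hypothetical, and `$page` is assumed to start at 1.

```php
<?php
// Standalone sketch: a real filter source would extend AbstractFilterSource
// and return the bundle's Result objects; plain arrays stand in for them here.
class InMemoryDomainFilterSource
{
    /** @var array id => name (made-up data) */
    private $domains = array(123 => 'hotels.com', 125 => 'hoteliers.com', 130 => 'expedia.com');

    public function getReadableName()
    {
        return 'Domain';
    }

    // Returns the entry for the given id, or null when the id is unknown.
    public function get($id)
    {
        return isset($this->domains[$id])
            ? array('id' => $id, 'name' => $this->domains[$id])
            : null;
    }

    // Case-insensitive substring search; $page is assumed to start at 1.
    public function getByQuery($query, $page, $pageSize)
    {
        $matches = array();
        foreach ($this->domains as $id => $name) {
            if ($query === '' || stripos($name, $query) !== false) {
                $matches[] = array('id' => $id, 'name' => $name);
            }
        }
        return array_slice($matches, ($page - 1) * $pageSize, $pageSize);
    }

    // Here an "entity" is assumed to be an array with "id" and "name" keys.
    public function getEntityName($entity)
    {
        return $entity['name'];
    }

    public function getEntityId($entity)
    {
        return $entity['id'];
    }
}

$source = new InMemoryDomainFilterSource();
$results = $source->getByQuery('hotel', 1, 10); // matches hotels.com and hoteliers.com
```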

##Custom Filters
- Custom Filters complement `Filter Sources` where the filters are not based on some external source
- Custom Filters need to extend `AbstractCustomFilter` to expose themselves via the REST API. They need to implement the following methods:
    - `getName()`: Returns the non-human-readable name of the filter
    - `getFilter()`: Returns an `\Elastica\Filter\AbstractFilter` instance

##Examples
- Check out the ViewAnalytics implementation for a Data Source example
- Check out the tests for QueryBuilder and API examples

You can also use the Query DSL to get stats against any data source
####QueryBuilder Example:

```php
$elastica = $this->getContainer()->get('elastica.client');
$analytics = new PageViewAnalytics($this->get('service_container'));
$qb = new QueryBuilder($elastica, $analytics);
$qb->addDimensions(array("domain", "http_method"))
    ->addMetrics(array("totalViews", "totalGets", "totalPosts"))
    ->setFilter(new \Elastica\Filter\Terms("domain", array("hotels.com")))
    ->setSort(array("totalViews" => "asc"))
    ->setIsNestedDimensions(true)
;
$resultSet = $qb->execute();
$stats = $resultSet->getNested();
```

####BulkQueryBuilder Example:

```php
$elastica = $this->getContainer()->get('elastica.client');
$analytics = new PageViewAnalytics($this->get('service_container'));
$bulkQb = new BulkQueryBuilder();
$qb1 = new QueryBuilder($elastica, $analytics);
$qb2 = new QueryBuilder($elastica, $analytics);
$qb1->addDimensions(array("domain", "http_method"))
    ->addMetrics(array("totalViews", "totalGets"));
$qb2->addDimensions(array("domain", "http_method"))
    ->addMetrics(array("totalViews", "totalGets"));
$bulkQb
    ->addQueryBuilder($qb1)
    ->addQueryBuilder($qb2);
// Results
$resultSets = $bulkQb->execute();
$stats = $resultSets[0]->getNested();
// Comparator Results
$compSet = $bulkQb->getComparatorSet(Percentage::TYPE /* Type of comparator */);
$compResults = $compSet->getNested();
```

####Documents Example:

```php
$elastica = $this->getContainer()->get('elastica.client');
$analytics = new PageViewAnalytics($this->get('service_container'));
$qb = new QueryBuilder($elastica, $analytics);
$qb->setFilter(new \Elastica\Filter\Terms("domain", array("hotels.com")))
    ->setSort(array("totalViews" => "desc"))
    ->setSize(10)
    ->setOffset(0)
;
$resultSet = $qb->execute();
$documents = $resultSet->getDocuments();
```

- Add the following config to your project's config.yml

```yaml
revinate_analytics:
    sources:
        <source name>: { class: <class path> }
    api: { path: <rest api path> }
```

Example:

```yaml
revinate_analytics:
    sources:
        page_view: { class: \Revinate\AnalyticsBundle\Analytics\PageView }
        review: { class: \Revinate\AnalyticsBundle\Analytics\Review }
    api: { path: '/api/analytics/' }
```

Note: the default value of `api.path` is `/api/analytics/`

- Add the following to your routing.yml file

```yaml
revinate_analytics:
    resource: .
    type: revinate_analytics
```

- Clear the cache and you should be able to see the following new routes using `console router:debug | grep revinate_analytics`

Route Name: `revinate_analytics_source_list`

Example Request:

```
GET /api/analytics/source
```

Example Response:

```json
{
  "page_view": {
    "name": "page_view",
    "_link": {
      "uri": "https://www.mydomain.com/api/analytics/page_view/source",
      "method": "GET"
    }
  },
  "review": {
    "name": "review",
    "_link": {
      "uri": "https://www.mydomain.com/api/analytics/review/source",
      "method": "GET"
    }
  }
}
```

Route Name: `revinate_analytics_source_get`

Example Request:

```
GET /api/analytics/{sourcename}/source
Example: GET /api/analytics/page_view/source
```

Example Response:

```
{
  "dimensions": [
    { "name": "all" },
    { "name": "domain" },
    { "name": "http_method" }
  ],
  "metrics": [
    { "name": "totalViews" },
    { "name": "totalGets" },
    { "name": "totalPosts" }
  ],
  "filterSources": [
    { "name": "User", "field": "user_id" },
    { "name": "Domain", "field": "domain" }
  ],
  "customFilters": [],
  "_links": { # More info about how to query stats and filter sources
    "stats": {
      ...
    },
    "filters": {
      ...
    }
  }
}
```

Route Name: `revinate_analytics_filter_query`

Example Request:

```
GET /api/analytics/source/{sourcename}/filter/{filtername}/option
Example: GET /api/analytics/source/page_view/filter/domain/hotel
```

Example Response:

```
[
  {
    "id": "123",
    "name": "hotels.com"
  },
  {
    "id": "125",
    "name": "hoteliers.com"
  },
  ...
]
```

Route Name: `revinate_analytics_stats_search`

```
POST /api/analytics/source/{sourcename}/stats
{
  "dimensions": ["dimension1", "dimension2", ...],
  "metrics": ["metric1", "metric2", ...],
  "filters": {
    "filter1": [<filterType>, "filterValue"],
    "filter2": [<filterType>, "filterValue"],
    ...
    # filterType can be "value", "range", "exists", "missing", "custom"
  },
  "dateRange": ["period"|"range", "period name", "field"], # Optional. Adds a date filter and sets extended bounds for dateHistogram dimensions. "field" specifies which field holds the date information, defaults to date.
  "sort": {"field": "direction"}, # ElasticSearch sort option compatibility. direction is either "asc" or "desc"
  "flags": {
    "nestedDimensions": false, # true/false. Defaults to false
    "enableInfo": true # true/false. Defaults to true. Override to disable or enable "_info" key for each bucket
  },
  "format": "nested", # nested (default), raw, flattened, tabular, google_data_table, documents
  "dimensionAggregate": { # Optional. Calculates a single result as an aggregation of the dimension buckets
    "type": "average", # The type of calculation to perform. Supported: average, ranked, ranked_reversed
    "info": 100 # Extra information for the aggregate. In the case of average, it is the expected number of buckets to be returned
  },
  "goals": {"my_metric_1": 10, "my_metric_2": 100}, # Optional. Map of goals for your metrics
  "context": { # any key-value pair list
    "key1": "value1",
    "key2": "value2"
  }
}
```

Note: When `format` is set to `documents`, the full documents are returned rather than the stats

Example Request:

```
POST /api/analytics/source/page_view/stats -d
{
  "dimensions": ["all", "domain"],
  "metrics": ["totalViews", "totalGets", "totalPosts"],
  "filters": {
    "userId": ["value", 10223],
    "date": ["range", {"from": "2015-04-01", "to": "2015-04-30"}]
  },
  "sort": { "totalViews": "asc"},
  "flags": {
    "nestedDimensions": false
  },
  "format": "nested"
}
```

Example Response:

```
{
  "all": {
    "totalViews": 10323,
    "totalGets": 8023,
    "totalPosts": 1800
  },
  "domain": {
    "hotels.com": {
      "totalViews": 323,
      "totalGets": 301,
      "totalPosts": 18
    },
    "expedia.com": {
      "totalViews": 1323,
      "totalGets": 1201,
      "totalPosts": 98
    },
    ...
  }
}
```
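
For PHP clients, the stats request body can be assembled as an array and encoded with `json_encode()`, and the response decoded with `json_decode()`. The sketch below mirrors the example request and a trimmed copy of the example response; it leaves out the HTTP call itself, since the host and authentication depend on your deployment. The variable names are illustrative only.

```php
<?php
// Request body mirroring the example stats request.
$request = array(
    'dimensions' => array('all', 'domain'),
    'metrics' => array('totalViews', 'totalGets', 'totalPosts'),
    'filters' => array(
        'userId' => array('value', 10223),
        'date' => array('range', array('from' => '2015-04-01', 'to' => '2015-04-30')),
    ),
    'sort' => array('totalViews' => 'asc'),
    'flags' => array('nestedDimensions' => false),
    'format' => 'nested',
);
$body = json_encode($request); // string to send as the POST body

// Response trimmed from the example response, decoded into associative arrays.
$responseJson = '{"all":{"totalViews":10323},'
    . '"domain":{"hotels.com":{"totalViews":323},"expedia.com":{"totalViews":1323}}}';
$stats = json_decode($responseJson, true);

// In the example response each dimension is a top-level key holding one
// bucket of metrics per dimension value.
$viewsByDomain = array();
foreach ($stats['domain'] as $domain => $metrics) {
    $viewsByDomain[$domain] = $metrics['totalViews'];
}
print_r($viewsByDomain);
```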

Route Name: `revinate_analytics_bulk_stats_search`

```
POST /api/analytics/source/{sourcename}/bulkstats
{
  "queries": [
    {
      "dimensions": ["dimension1", "dimension2", ...],
      "metrics": ["metric1", "metric2", ...],
      "filters": {...},
      "sort": {"field": "direction"} # ElasticSearch sort option compatibility. direction is either "asc" or "desc"
    },
    {
      "dimensions": ["dimension1", "dimension2", ...],
      "metrics": ["metric1", "metric2", ...],
      "filters": {...},
      "sort": {"field": "direction"} # ElasticSearch sort option compatibility. direction is either "asc" or "desc"
    }
  ],
  "flags": {
    "nestedDimensions": false # flags are common for all queries
  },
  "format": "nested", # format is common for all queries
  "dimensionAggregate": "average", # Optional. Supported: average, ranked, ranked_reversed
  "comparator": "change", # Optional. null (default), change, percentage, index. Returns comparison between multiple queries
  "goals": {"my_metric_1": 10, "my_metric_2": 100}, # Optional. Map of goals for your metrics
  "context": { # any key-value pair list
    "key1": "value1",
    "key2": "value2"
  }
}
```

Response Format:

```
{
  "results": [
    {
      "all": {
        "totalViews": 10323,
        "totalGets": 8023,
        "totalPosts": 1800
      },
      "domain": {
        "hotels.com": {
          "totalViews": 323,
          "totalGets": 301,
          "totalPosts": 18
        },
        "expedia.com": {
          "totalViews": 1323,
          "totalGets": 1201,
          "totalPosts": 98
        },
        ...
      }
    },
    {
      "all": {
        "totalViews": 10323,
        "totalGets": 8023,
        "totalPosts": 1800
      },
      "domain": {
        "hotels.com": {
          "totalViews": 323,
          "totalGets": 301,
          "totalPosts": 18
        },
        "expedia.com": {
          "totalViews": 1323,
          "totalGets": 1201,
          "totalPosts": 98
        },
        ...
      }
    }
  ],
  "comparator": {
    "0": {
      # Comparison between the two queries
    }
  }
}
```
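
This document does not spell out how the `change`, `percentage` and `index` comparators are computed. Purely as an illustration of comparing two result sets metric by metric, the sketch below calculates a percentage difference of the second query relative to the first. The formula, the function name and the numbers are assumptions, not the bundle's comparator implementation.

```php
<?php
// Illustration only: per-metric percentage difference of the second result
// relative to the first. This is NOT taken from the bundle's comparator code.
function percentageDifference(array $first, array $second)
{
    $diff = array();
    foreach ($first as $metric => $base) {
        if (!isset($second[$metric]) || $base == 0) {
            $diff[$metric] = null; // undefined without a non-zero baseline
            continue;
        }
        $diff[$metric] = ($second[$metric] - $base) * 100 / $base;
    }
    return $diff;
}

// Made-up metric values for one bucket of each query.
$firstQuery = array('totalViews' => 200, 'totalGets' => 150);
$secondQuery = array('totalViews' => 250, 'totalGets' => 120);

$result = percentageDifference($firstQuery, $secondQuery);
// totalViews rose by 25 percent, totalGets fell by 20 percent
```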

Route Name: `revinate_analytics_document_search`

```
POST /api/analytics/source/{sourcename}/documents
{
  "filters": {...},
  "sort": {"field": "direction"}, # ElasticSearch sort option compatibility. direction is either "asc" or "desc"
  "size": 10,
  "offset": 0
}
```

Response Format:

```json
[
  {
    "device": "ios",
    "browser": "chrome",
    "siteId": 1,
    "views": 6,
    "date": "2015-08-09T01:19:14+00:00"
  },
  {
    "device": "ios",
    "browser": "opera",
    "siteId": 7,
    "views": 5,
    "date": "2015-07-09T01:19:14+00:00"
  }
]
```
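
When paging through documents, `size` and `offset` can be derived from a page number. The helper below is a small sketch that assumes `offset` is the number of documents to skip, as the `setSize(10)`/`setOffset(0)` pair in the QueryBuilder example suggests. The function name, the 1-based page convention and the `device` filter are hypothetical.

```php
<?php
// Builds a document-search request body for a 1-based page number.
// "size" is the page length and "offset" the number of documents to skip
// (an assumption about the API, see the note above).
function documentSearchBody(array $filters, array $sort, $page, $pageSize)
{
    return array(
        'filters' => $filters,
        'sort' => $sort,
        'size' => $pageSize,
        'offset' => ($page - 1) * $pageSize,
    );
}

// Third page of ten documents, most viewed first. The "device" filter is a
// made-up example using the [filterType, filterValue] shape of the stats API.
$body = documentSearchBody(
    array('device' => array('value', 'ios')),
    array('views' => 'desc'),
    3,
    10
);
echo json_encode($body), "\n";
```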

You will need to have VirtualBox installed.

```sh
brew install docker docker-machine docker-compose
docker-machine create -d virtualbox analytics-bundle
```

At this point you may need to reboot, because VirtualBox tends to lose network connectivity when the routing rules for vboxnet* disappear.

```sh
eval $(docker-machine env analytics-bundle)
composer install
docker-compose rm -f
docker-compose build
docker-compose up
```

This will run the test suite. Each time you change the code you will have to repeat these three steps, because docker-compose otherwise reuses the old code cached in the old images:

```sh
docker-compose rm -f
docker-compose build
docker-compose up
```

Add port forwarding rules to docker-compose.yml so that it says:

```yaml
elasticsearch:
    image: elasticsearch:1.7.3
    ports:
        - "9200:9200"
```

Modify your /etc/hosts, adding the DOCKER_HOST address (normally 192.168.99.100, but it is worth verifying by running `docker-machine env analytics-bundle`):

```
192.168.99.100 elasticsearch
```

and then

```sh
docker-compose up -d
phpunit
```

```sh
docker-compose build
docker-compose run analytics-bundle
```