Symfony & Elasticsearch - Jonny Schmid
@jschmid_no1

getmustapp.com

This Talk

  • Dependencies
  • Set up a dev environment in 15 mins
  • Debugging tools
  • Simple queries
  • Geo bounding box query

About Elasticsearch

  • Search server written in Java
  • Based on Lucene – information retrieval library
  • JSON documents and RESTful API
  • Very popular: used by Uber and Netflix, among others

Some Cool Features

  • Stemming
  • Fuzzy search
  • Autocomplete
  • Search alerts
  • “More like this” queries
  • Geo distance / bounding box
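
To make one of these features concrete, here is a minimal sketch of a fuzzy search request body, built as a plain PHP array and printed as JSON. The helper name `fuzzyMatchQuery` and the misspelled search term are made up for illustration; the `match` query with a `fuzziness` option is standard Elasticsearch query DSL.

```php
<?php
// Hypothetical helper (not part of any library): builds a fuzzy match query body.
function fuzzyMatchQuery(string $field, string $text): array
{
    return [
        'query' => [
            'match' => [
                $field => [
                    'query' => $text,
                    // "AUTO" lets Elasticsearch choose the allowed edit distance
                    // based on the length of each term.
                    'fuzziness' => 'AUTO',
                ],
            ],
        ],
    ];
}

// A typo like "Symfny" should still be able to match "Symfony".
echo json_encode(fuzzyMatchQuery('caption', 'Symfny'), JSON_PRETTY_PRINT), PHP_EOL;
```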

Elasticsearch: GitHub issues, within 1 week [chart]

Elasticsearch: GitHub issues closed, within 1 week [chart]

Elasticsearch: GitHub commits [chart]

elastic.co/guide

Outdated docs

PHP Clients

elasticsearch-php

  • Official library (since 2013)
  • Maps all queries to simple array structure

Elastica

  • First developed in 2010
  • Maps queries to objects
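
The difference between the two styles can be sketched without installing either library. The class `MatchQuerySketch` below is a hypothetical stand-in for an object-style query class and is not Elastica's actual API; the point is only that both styles end up as the same array structure.

```php
<?php
// Style 1: plain nested arrays, the kind of structure an array-based client works with.
$arrayQuery = ['match' => ['caption' => ['query' => 'Symfony']]];

// Style 2: a hypothetical value object standing in for an object-based query class.
final class MatchQuerySketch
{
    private $field;
    private $text;

    public function __construct(string $field, string $text)
    {
        $this->field = $field;
        $this->text = $text;
    }

    // Serializes the object to the same array structure as style 1.
    public function toArray(): array
    {
        return ['match' => [$this->field => ['query' => $this->text]]];
    }
}

$objectQuery = (new MatchQuerySketch('caption', 'Symfony'))->toArray();

// Both styles describe the identical query.
var_dump($arrayQuery === $objectQuery); // bool(true)
```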

FOSElasticaBundle

  • Elastica for Symfony
  • Listeners for Doctrine events for automatic indexing
  • Automatically generates mappings using a serializer

Elasticsearch   Elastica   FOSElasticaBundle
2.2.0           -          -
2.1.1           3.1.0      -
1.5.2           2.1.0      3.1.8

Elasticsearch 2.x = Lots of improvements

FOSElasticaBundle BC
Fuck it, we'll install the dependencies locally

composer.json


{
  "repositories": [
    {
      "type": "vcs",
      "url": "https://github.com/foaly-nr1/FOSElasticaBundle.git"
    }
  ],
  "require": {
    "friendsofsymfony/elastica-bundle": "dev-patch-elastica-3.1.0 as 4.0@dev"
  }
}
					

$ composer update
					

This Talk

  • Dependencies
  • Set up a dev environment in 15 mins

Let's get started

Vagrant/Docker: Elasticsearch 2.1


Vagrant.configure(2) do |config|
  config.vm.define "elasticsearch" do |v|

    v.vm.box = "ubuntu/trusty64"

    v.vm.network "private_network", ip: "192.168.33.111"
    v.vm.network :forwarded_port, guest: 9200, host: 9200

    v.vm.provision "docker" do |d|
      d.images = ["elasticsearch:2.1"]
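      # Without an explicit image, the container name is used as the image name,
      # which would run the latest "elasticsearch" image instead of the pulled 2.1.
      d.run "elasticsearch", image: "elasticsearch:2.1", args: "-p '9200:9200'"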
    end

  end
end
					

You Know, for Search


$ vagrant up
$ curl 192.168.33.111:9200
{
  "name" : "Desmond Pitt",
  "cluster_name" : "elasticsearch",
  "version" : {
    "number" : "2.1.1",
    "build_hash" : "40e2c53a6b6c2972b3d13846e450e66f4375bd71",
    "build_timestamp" : "2015-12-15T13:05:55Z",
    "build_snapshot" : false,
    "lucene_version" : "5.3.1"
  },
  "tagline" : "You Know, for Search"
}
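
Since the bundle and Elastica versions above depend on the server version, it can be handy to check the banner programmatically. A minimal sketch, assuming the banner JSON has already been fetched; `serverIsAtLeast` is a hypothetical helper and the sample is a trimmed copy of the response above.

```php
<?php
// Sample banner, trimmed from the curl response shown above.
$banner = '{"name":"Desmond Pitt","version":{"number":"2.1.1","lucene_version":"5.3.1"},'
        . '"tagline":"You Know, for Search"}';

// Hypothetical helper: true when the server reports at least the given version.
function serverIsAtLeast(string $bannerJson, string $minimum): bool
{
    $data = json_decode($bannerJson, true);
    if (!isset($data['version']['number'])) {
        throw new InvalidArgumentException('Not an Elasticsearch banner response.');
    }

    // version_compare() understands dotted version strings such as "2.1.1".
    return version_compare($data['version']['number'], $minimum, '>=');
}

var_dump(serverIsAtLeast($banner, '2.0.0')); // bool(true)
var_dump(serverIsAtLeast($banner, '2.2.0')); // bool(false)
```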
					

config.yml


fos_elastica:
    clients:
        default: { host: 192.168.33.111, port: 9200 }
        		

/** @ORM\Entity */
class BlogPost
{
    /**
     * @ORM\Id
     * @ORM\Column(type="integer")
     * @ORM\GeneratedValue(strategy="AUTO")
     */
    protected $id;

    /**
     * @ORM\Column(type="string", length=250)
     */
    public $caption;

    /**
     * @ORM\Column(type="text")
     */
    public $content;

    // Getters and setters
}
					

Manual config

config.yml


fos_elastica:
    indexes:
        app:
            types:
                blog_post:
                    mappings:
                        caption: ~
                        content: ~

                    persistence:
                        driver: orm
                        model: AppBundle\Entity\BlogPost
                        provider: ~
                        listener: ~
                        finder: ~
                    

Manual config

config.yml


fos_elastica:
    indexes:
        app:
            types:
                blog_post:
                    mappings:
                        caption: ~
                        content: ~

                    persistence: &ELASTICAORM
                        driver: orm
                        model: AppBundle\Entity\BlogPost
                        provider: ~
                        listener: ~
                        finder: ~
                    

JMSSerializer

Entity.BlogPost.yml


AppBundle\Entity\BlogPost:
    exclusion_policy: ALL
    properties:
        caption:
            expose: true
            groups: [ 'elastica' ]
        content:
            expose: true
            groups: [ 'elastica' ]
            		

$ php app/console doctrine:schema:update -f
$ php app/console fos:elastica:populate
$ curl '192.168.33.111:9200/app?pretty'
{
  "app" : {
    "aliases" : { },
    "mappings" : {
      "blog_post" : {
        "_meta" : {
          "model" : "AppBundle\\Entity\\BlogPost"
        },
        "properties" : {
          "caption" : {
            "type" : "string"
          },
          "content" : {
            "type" : "string"
          }
        }
      }
    },
    "settings" : {
      "index" : {
        "creation_date" : "1455619375723",
        "number_of_shards" : "5",
        "number_of_replicas" : "1",
        "uuid" : "xSt_rtJuRx2WWL65bQ0uTw",
        "version" : {
          "created" : "2020099"
        }
      }
    },
    "warmers" : { }
  }
}

					

Add some records...

...and we're good to search


$finder = $this->container->get('fos_elastica.finder.app.blog_post');
$results = $finder->find('Symfony');

return new JsonResponse($results);
					

This Talk

  • Dependencies
  • Set up a dev environment in 15 mins
  • Debugging tools

Chrome Logger

config_dev.yml


monolog:
    handlers:
        chromephp:
            type: chromephp
            level: info
            channels: [elastica]
            		

curl -XGET 'http://192.168.33.111:9200/app/blog_post/_search' -d '{
	"query": {
		"query_string": {
			"query": "Symfony"
		}
	}
}'
					

{
  "took" : 14,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.11506981,
    "hits" : [ {
      "_index" : "app",
      "_type" : "blog_post",
      "_id" : "4",
      "_score" : 0.11506981,
      "_source" : {
        "caption" : "Symfony Meetup",
        "content" : "The February meetup of #Symfonyuk takes places on Feb 16."
      }
    } ]
  }
}
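
When debugging, only a few fields of this response usually matter. A minimal sketch of pulling them out of the raw JSON; `summarizeHits` is a hypothetical helper, and the sample is a shortened copy of the response above.

```php
<?php
// Shortened copy of the search response shown above.
$response = '{"took":14,"timed_out":false,"hits":{"total":1,"max_score":0.11506981,'
          . '"hits":[{"_index":"app","_type":"blog_post","_id":"4","_score":0.11506981,'
          . '"_source":{"caption":"Symfony Meetup"}}]}}';

// Hypothetical helper: reduces each hit to its id, score and caption.
function summarizeHits(string $json): array
{
    $data = json_decode($json, true);
    $rows = [];
    foreach ($data['hits']['hits'] as $hit) {
        $rows[] = [
            'id'      => $hit['_id'],
            'score'   => $hit['_score'],
            'caption' => $hit['_source']['caption'],
        ];
    }

    return $rows;
}

foreach (summarizeHits($response) as $row) {
    printf("#%s  score=%.4f  %s\n", $row['id'], $row['score'], $row['caption']);
}
```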
					

How about filters and scoring, you ask?

This Talk

  • Dependencies
  • Set up a dev environment in 15 mins
  • Debugging tools
  • Simple queries

Elastica

Deprecated in Elastica 3.x

  • Elastica\Query\Filtered
  • Elastica\Filter*
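
What the deprecation means at the query DSL level can be sketched with plain arrays: a 1.x-style `filtered` query becomes a `bool` query, where the scored part goes under `must` and the non-scoring part under `filter`. The helper `filteredToBool` is hypothetical and assumes the input contains at least one of `query` or `filter`.

```php
<?php
// Hypothetical helper: rewrites a 1.x-style "filtered" query into a 2.x-style "bool" query.
// Assumes the "filtered" query contains at least one of "query" or "filter".
function filteredToBool(array $query): array
{
    if (!isset($query['filtered'])) {
        return $query; // nothing to rewrite
    }

    $bool = [];
    if (isset($query['filtered']['query'])) {
        $bool['must'] = [$query['filtered']['query']];    // still contributes to the score
    }
    if (isset($query['filtered']['filter'])) {
        $bool['filter'] = [$query['filtered']['filter']]; // restricts results, no scoring
    }

    return ['bool' => $bool];
}

$old = [
    'filtered' => [
        'query'  => ['match' => ['caption' => 'Symfony']],
        'filter' => ['terms' => ['category' => ['tech', 'sports']]],
    ],
];

echo json_encode(filteredToBool($old), JSON_PRETTY_PRINT), PHP_EOL;
```

The Elastica example that follows puts the terms query under `must` instead, which also restricts the results but lets the clause take part in scoring.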

$finder = $this->container->get('fos_elastica.finder.app.blog_post');

$boolQuery = new \Elastica\Query\BoolQuery();

$fieldQuery = new \Elastica\Query\Match();
$fieldQuery->setFieldQuery('caption', 'Symfony');
$boolQuery->addShould($fieldQuery); // --> Previously matching/scoring

$tagsQuery = new \Elastica\Query\Terms();
$tagsQuery->setTerms('category', array('tech', 'sports'));
$boolQuery->addMust($tagsQuery); // --> Previously filter

$results = $finder->find($boolQuery);

return new JsonResponse($results);
					

{
  "query": {
    "bool": {
      "should": [{
        "match": {
          "caption": {
            "query": "Symfony"
          }
        }
      }],
      "must": [{
        "terms": {
          "category": [ "tech", "sports" ]
        }
      }]
    }
  }
}
					


{
  "took" : 44,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.020333186,
    "hits" : [ {
      "_index" : "app",
      "_type" : "blog_post",
      "_id" : "4",
      "_score" : 0.020333186,
      "_source" : {
        "caption" : "Symfony Meetup",
        "content" : "The February meetup of #Symfonyuk takes places on Feb 16."
      }
    } ]
  }
}
					

Controlling Relevance


fos_elastica:
    indexes:
        app:
            types:
                blog_post:
                    mappings:
                        caption:
                            boost: 1.5
                        content: ~
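
A boost in the mapping is applied at index level. As an alternative, the same preference can be expressed per query, which avoids touching the mapping when the weight changes. A minimal sketch with plain arrays; `boostedCaptionQuery` is a hypothetical helper and the default of 1.5 simply mirrors the mapping above.

```php
<?php
// Hypothetical helper: searches caption and content, weighting caption matches higher.
function boostedCaptionQuery(string $text, float $captionBoost = 1.5): array
{
    return [
        'query' => [
            'bool' => [
                'should' => [
                    // Query-time boost: multiplies the score contribution of this clause.
                    ['match' => ['caption' => ['query' => $text, 'boost' => $captionBoost]]],
                    ['match' => ['content' => ['query' => $text]]],
                ],
            ],
        ],
    ];
}

echo json_encode(boostedCaptionQuery('Symfony'), JSON_PRETTY_PRINT), PHP_EOL;
```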
                    

This Talk

  • Dependencies
  • Set up a dev environment in 15 mins
  • Debugging tools
  • Simple queries
  • Geo bounding box query

Geo Functions

  • Higher relevance when closer to a point (e.g. location of user)
  • Filter results within a radius from a point
  • Filter results within a bounding box
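
The first two items can be sketched as raw query bodies: a `geo_distance` clause for the radius filter and a `function_score` query with a `gauss` decay for "closer is better". The helper names, the field name `geo_point`, the origin and the distances are all illustrative assumptions.

```php
<?php
// Hypothetical helpers; field name, origin and distances are illustrative.

// Keeps only documents within $distance of $origin ("lat,lon").
function geoDistanceFilter(string $field, string $origin, string $distance): array
{
    return ['geo_distance' => ['distance' => $distance, $field => $origin]];
}

// Scores documents higher the closer they are to $origin, using a gauss decay.
function nearerIsBetter(string $field, string $origin, string $scale): array
{
    return [
        'function_score' => [
            'query' => ['match_all' => new stdClass()], // stdClass encodes as {}
            'functions' => [
                ['gauss' => [$field => ['origin' => $origin, 'scale' => $scale]]],
            ],
        ],
    ];
}

$origin = '51.5132852,-0.0873233';
$body = ['query' => ['bool' => [
    'must'   => [nearerIsBetter('geo_point', $origin, '2km')],
    'filter' => [geoDistanceFilter('geo_point', $origin, '5km')],
]]];

echo json_encode($body, JSON_PRETTY_PRINT), PHP_EOL;
```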


/**
 * @ORM\Embeddable
 */
class GeoPoint
{
  /**
   * @ORM\Column(type="float", nullable=true)
   */
  public $longitude;

  /**
   * @ORM\Column(type="float", nullable=true)
   */
  public $latitude;

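  public function __toString()
  {
    // Elasticsearch expects "lat,lon". Compare against null so that 0.0 stays a valid coordinate.
    return null !== $this->longitude && null !== $this->latitude ? $this->latitude.','.$this->longitude : '';
  }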
}
					

/**
 * @ORM\Entity
 */
class BlogPost
{
  // ...

  /**
   * @ORM\Embedded(class="GeoPoint")
   */
  public $geoPoint;

  public function __construct()
  {
    $this->geoPoint = new GeoPoint();
  }

  // Getters and setters
}
					

fos_elastica:
    indexes:
        app:
            types:
                blog_post:
                    mappings:
                        caption: ~
                        content: ~
                        geo_point:
                            type: geo_point
                    

Add some records...

Another debugging tool


$ curl 'http://192.168.33.111:9200/app/blog_post/4?pretty'
{
  "_index" : "app",
  "_type" : "blog_post",
  "_id" : "4",
  "_version" : 1,
  "found" : true,
  "_source" : {
    "caption" : "Symfony Meetup",
    "content" : "The February meetup of #Symfonyuk takes places on Feb 16.",
    "geo_point" : "51.5132852,-0.0873233"
  }
}
					

...and we're good to search


$finder = $this->container->get('fos_elastica.finder.app.blog_post');

$boolQuery = new \Elastica\Query\BoolQuery();

// (field, [top_left, bottom_right])
$geoQuery = new \Elastica\Query\GeoBoundingBox('geo_point', [
  '51.517577,-0.090372',
  '51.509047,-0.079461',
]);
$boolQuery->addMust($geoQuery);

$results = $finder->find($boolQuery);

return new JsonResponse($results);
        			

{
  "query": {
    "bool": {
      "must": [{
        "geo_bounding_box": {
          "geo_point": {
            "top_left": "51.517577,-0.090372",
            "bottom_right": "51.509047,-0.079461"
          }
        }
      }]
    }
  }
}
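
To see why the document matches, the bounding box test can be reproduced in a few lines of plain PHP. This is a simplified sketch: the helpers are hypothetical, points are "lat,lon" strings as in the query above, and boxes crossing the antimeridian are not handled.

```php
<?php
// Parses a "lat,lon" string into [lat, lon] floats.
function parseLatLon(string $point): array
{
    list($lat, $lon) = array_map('floatval', explode(',', $point));

    return [$lat, $lon];
}

// Simplified check: true when $point lies inside the box spanned by the two corners.
function inBoundingBox(string $point, string $topLeft, string $bottomRight): bool
{
    list($lat, $lon)      = parseLatLon($point);
    list($top, $left)     = parseLatLon($topLeft);
    list($bottom, $right) = parseLatLon($bottomRight);

    return $lat <= $top && $lat >= $bottom && $lon >= $left && $lon <= $right;
}

$topLeft     = '51.517577,-0.090372';
$bottomRight = '51.509047,-0.079461';

var_dump(inBoundingBox('51.5132852,-0.0873233', $topLeft, $bottomRight)); // bool(true)

// With latitude and longitude swapped, the same place no longer matches.
var_dump(inBoundingBox('-0.0873233,51.5132852', $topLeft, $bottomRight)); // bool(false)
```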
					

{
  "took" : 304,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "app",
      "_type" : "blog_post",
      "_id" : "4",
      "_score" : 1.0,
      "_source" : {
        "caption" : "Symfony Meetup",
        "content" : "The February meetup of #Symfonyuk takes places on Feb 16.",
        "geo_point" : "51.5132852,-0.0873233"
      }
    } ]
  }
}
					
