MongoEngine is an Object-Document Mapper written in Python for working with MongoDB. To install it, simply run:
$ pip install -U mongoengine
To get help with using MongoEngine, use the MongoEngine Users mailing list or the ever-popular Stack Overflow.
Yes please! We are always looking for contributions, additions and improvements.
The source is available on GitHub and contributions are always encouraged. Contributions can be as simple as minor tweaks to this documentation, the website or the core.
To contribute, fork the project on GitHub and send a pull request.
See the Changelog for a full list of changes to MongoEngine and Upgrading for upgrade information.
Note
Always read and test the upgrade documentation before putting updates live in production ;)
Download the docs in pdf or epub formats for offline reading.
This tutorial introduces MongoEngine by means of example — we will walk through how to create a simple Tumblelog application. A Tumblelog is a type of blog where posts are not constrained to being conventional text-based posts. As well as text-based entries, users may post images, links, videos, etc. For simplicity’s sake, we’ll stick to text, image and link entries in our application. As the purpose of this tutorial is to introduce MongoEngine, we’ll focus on the data-modelling side of the application, leaving out a user interface.
Before we start, make sure that a copy of MongoDB is running in an accessible location — running it locally will be easier, but if that is not an option then it may be run on a remote server. If you haven’t installed mongoengine, simply use pip to install it like so:
$ pip install mongoengine
Before we can start using MongoEngine, we need to tell it how to connect to our
instance of mongod. For this we use the connect()
function. If running locally the only argument we need to provide is the name
of the MongoDB database to use:
from mongoengine import *
connect('tumblelog')
There are lots of options for connecting to MongoDB, for more information about them see the Connecting to MongoDB guide.
MongoDB is schemaless, which means that no schema is enforced by the database — we may add and remove fields however we want and MongoDB won’t complain. This makes life a lot easier in many regards, especially when there is a change to the data model. However, defining schemata for our documents can help to iron out bugs involving incorrect types or missing fields, and also allow us to define utility methods on our documents in the same way that traditional ORMs do.
In our Tumblelog application we need to store several different types of information. We will need to have a collection of users, so that we may link posts to an individual. We also need to store our different types of posts (e.g. text, image and link) in the database. To aid navigation of our Tumblelog, posts may have tags associated with them, so that the list of posts shown to the user may be limited to posts that have been assigned a specific tag. Finally, it would be nice if comments could be added to posts. We’ll start with users, as the other document models are slightly more involved.
Just as if we were using a relational database with an ORM, we need to define
which fields a User
may have, and what types of data they might store:
class User(Document):
    email = StringField(required=True)
    first_name = StringField(max_length=50)
    last_name = StringField(max_length=50)
This looks similar to how the structure of a table would be defined in a regular ORM. The key difference is that this schema will never be passed on to MongoDB — this will only be enforced at the application level, making future changes easy to manage. Also, the User documents will be stored in a MongoDB collection rather than a table.
Now we’ll think about how to store the rest of the information. If we were using a relational database, we would most likely have a table of posts, a table of comments and a table of tags. To associate the comments with individual posts, we would put a column in the comments table that contained a foreign key to the posts table. We’d also need a link table to provide the many-to-many relationship between posts and tags. Then we’d need to address the problem of storing the specialised post-types (text, image and link). There are several ways we can achieve this, but each of them have their problems — none of them stand out as particularly intuitive solutions.
Happily, MongoDB isn’t a relational database, so we’re not going to do it that
way. As it turns out, we can use MongoDB’s schemaless nature to provide us with
a much nicer solution. We will store all of the posts in one collection and
each post type will only store the fields it needs. If we later want to add
video posts, we don’t have to modify the collection at all, we just start
using the new fields we need to support video posts. This fits with the
Object-Oriented principle of inheritance nicely. We can think of
Post
as a base class, and TextPost
, ImagePost
and
LinkPost
as subclasses of Post
. In fact, MongoEngine supports
this kind of modelling out of the box — all you need to do is turn on inheritance
by setting allow_inheritance
to True in the meta
:
class Post(Document):
    title = StringField(max_length=120, required=True)
    author = ReferenceField(User)

    meta = {'allow_inheritance': True}

class TextPost(Post):
    content = StringField()

class ImagePost(Post):
    image_path = StringField()

class LinkPost(Post):
    link_url = StringField()
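To see how this works, it helps to picture the stored shape. The following is an illustrative sketch using plain Python dicts (field values are invented, and this is not MongoEngine API): with inheritance enabled, each document in the shared collection carries a `_cls` discriminator naming the concrete class, which MongoEngine uses to rebuild the right subclass on load.

```python
# Rough shape of a TextPost as stored in the shared "post" collection
# when allow_inheritance is on (values invented for illustration).
stored_text_post = {
    "_cls": "Post.TextPost",   # discriminator MongoEngine adds
    "title": "Fun with MongoEngine",
    "content": "Took a look at MongoEngine today, looks pretty cool.",
}

def concrete_class_name(doc):
    """Return the leaf class encoded in the _cls path, e.g. 'TextPost'."""
    return doc["_cls"].rsplit(".", 1)[-1]
```

Because link posts never set `content` and text posts never set `image_path`, each document only stores the fields its type actually uses.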
We are storing a reference to the author of the posts using a
ReferenceField
object. These are similar to foreign key
fields in traditional ORMs, and are automatically translated into references
when they are saved, and dereferenced when they are loaded.
Now that we have our Post models figured out, how will we attach tags to them?
MongoDB allows us to store lists of items natively, so rather than having a
link table, we can just store a list of tags in each post. So, for both
efficiency and simplicity’s sake, we’ll store the tags as strings directly
within the post, rather than storing references to tags in a separate
collection. Especially as tags are generally very short (often even shorter
than a document’s id), this denormalisation won’t have a strong impact on the
size of our database. So let’s take a look at the code for our modified
Post class:
class Post(Document):
    title = StringField(max_length=120, required=True)
    author = ReferenceField(User)
    tags = ListField(StringField(max_length=30))
The ListField
object that is used to define a Post’s tags
takes a field object as its first argument — this means that you can have
lists of any type of field (including lists).
Note
We don’t need to modify the specialised post types as they all
inherit from Post
.
A comment is typically associated with one post. In a relational database, to display a post with its comments, we would have to retrieve the post from the database, then query the database again for the comments associated with the post. This works, but there is no real reason to be storing the comments separately from their associated posts, other than to work around the relational model. Using MongoDB we can store the comments as a list of embedded documents directly on a post document. An embedded document should be treated no differently than a regular document; it just doesn’t have its own collection in the database. Using MongoEngine, we can define the structure of embedded documents, along with utility methods, in exactly the same way we do with regular documents:
class Comment(EmbeddedDocument):
    content = StringField()
    name = StringField(max_length=120)
We can then store a list of comment documents in our post document:
class Post(Document):
    title = StringField(max_length=120, required=True)
    author = ReferenceField(User)
    tags = ListField(StringField(max_length=30))
    comments = ListField(EmbeddedDocumentField(Comment))
The ReferenceField
object takes a keyword argument
reverse_delete_rule, which specifies what to do if the referenced document is deleted.
To delete all the posts when a user is deleted, set the rule:
class Post(Document):
    title = StringField(max_length=120, required=True)
    author = ReferenceField(User, reverse_delete_rule=CASCADE)
    tags = ListField(StringField(max_length=30))
    comments = ListField(EmbeddedDocumentField(Comment))
See ReferenceField
for more information.
Note
MapFields and DictFields currently don’t support automatic handling of deleted references.
Now that we’ve defined how our documents will be structured, let’s start adding
some documents to the database. Firstly, we’ll need to create a User
object:
ross = User(email='ross@example.com', first_name='Ross', last_name='Lawley').save()
Note
We could have also defined our user using attribute syntax:
ross = User(email='ross@example.com')
ross.first_name = 'Ross'
ross.last_name = 'Lawley'
ross.save()
Now that we’ve got our user in the database, let’s add a couple of posts:
post1 = TextPost(title='Fun with MongoEngine', author=ross)
post1.content = 'Took a look at MongoEngine today, looks pretty cool.'
post1.tags = ['mongodb', 'mongoengine']
post1.save()
post2 = LinkPost(title='MongoEngine Documentation', author=ross)
post2.link_url = 'http://docs.mongoengine.com/'
post2.tags = ['mongoengine']
post2.save()
Note
If you change a field on an object that has already been saved and then
call save()
again, the document will be updated.
So now we’ve got a couple of posts in our database, how do we display them?
Each document class (i.e. any class that inherits either directly or indirectly
from Document
) has an objects
attribute, which is
used to access the documents in the database collection associated with that
class. So let’s see how we can get our posts’ titles:
for post in Post.objects:
    print(post.title)
This will print the titles of our posts, one on each line. But what if we want
to access the type-specific data (link_url, content, etc.)? One way is simply
to use the objects
attribute of a subclass of Post
:
for post in TextPost.objects:
    print(post.content)
Using TextPost’s objects
attribute only returns documents that were
created using TextPost
. Actually, there is a more general rule here:
the objects
attribute of any subclass of Document
only looks for documents that were created using that subclass or one of its
subclasses.
So how would we display all of our posts, showing only the information that
corresponds to each post’s specific type? There is a better way than just using
each of the subclasses individually. When we used Post
’s
objects
attribute earlier, the objects being returned weren’t actually
instances of Post
— they were instances of the subclass of
Post
that matches the post’s type. Let’s look at how this works in
practice:
for post in Post.objects:
    print(post.title)
    print('=' * len(post.title))

    if isinstance(post, TextPost):
        print(post.content)

    if isinstance(post, LinkPost):
        print('Link:', post.link_url)

    print()
This would print the title of each post, followed by the content if it was a text post, and “Link: <url>” if it was a link post.
The objects
attribute of a Document
is actually a
QuerySet
object. This lazily queries the
database only when you need the data. It may also be filtered to narrow down
your query. Let’s adjust our query so that only posts with the tag “mongodb”
are returned:
for post in Post.objects(tags='mongodb'):
    print(post.title)
There are also methods available on QuerySet
objects that allow different results to be returned, for example, calling
first()
on the objects
attribute will return a single document,
the first matched by the query you provide. Aggregation functions may also be
used on QuerySet
objects:
num_posts = Post.objects(tags='mongodb').count()
print('Found %d posts with tag "mongodb"' % num_posts)
If you got this far you’ve made a great start, so well done! The next step on your MongoEngine journey is the full user guide, where you can learn in depth how to use MongoEngine and MongoDB.
To use MongoEngine, you will need to download MongoDB and ensure it is running in an accessible location. You will also need PyMongo to use MongoEngine, but if you install MongoEngine using setuptools, then the dependencies will be handled for you.
MongoEngine is available on PyPI, so to use it you can use pip:
$ pip install mongoengine
Alternatively, if you don’t have setuptools installed, download the source from PyPI and run:
$ python setup.py install
To use the bleeding-edge version of MongoEngine, you can get the source from GitHub and install it as above:
$ git clone git://github.com/mongoengine/mongoengine
$ cd mongoengine
$ python setup.py install
To connect to a running instance of mongod, use the
connect()
function. The first argument is the name of the
database to connect to:
from mongoengine import connect
connect('project1')
By default, MongoEngine assumes that the mongod instance is running
on localhost on port 27017. If MongoDB is running elsewhere, you should
provide the host
and port
arguments to
connect()
:
connect('project1', host='192.168.1.35', port=12345)
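The fallback behaviour can be pictured with a tiny helper (a hypothetical sketch, not part of MongoEngine’s API): when no host or port is supplied, the target defaults to a local mongod on the standard port.

```python
def resolve_target(host=None, port=None):
    # Mirrors connect()'s defaults: a mongod on localhost:27017
    # when no host/port is supplied.
    return (host or 'localhost', port or 27017)
```

So `resolve_target()` yields `('localhost', 27017)`, while `resolve_target('192.168.1.35', 12345)` passes the explicit values straight through.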
If the database requires authentication, username
and password
arguments should be provided:
connect('project1', username='webapp', password='pwd123')
URI style connections are also supported – just supply the URI as
the host
to
connect()
:
connect('project1', host='mongodb://localhost/database_name')
Note
Database, username and password from the URI string override
corresponding parameters in connect()
:
connect(
    name='test',
    username='user',
    password='12345',
    host='mongodb://admin:qwerty@localhost/production'
)
will establish a connection to the production database using the admin username and the qwerty password.
MongoEngine supports
MongoReplicaSetClient
. To use it,
please use a URI style connection and provide the replicaSet name
in the connection kwargs.
Read preferences are supported through the connection or via individual queries by passing the read_preference, for example:
Bar.objects().read_preference(ReadPreference.PRIMARY)
Bar.objects(read_preference=ReadPreference.PRIMARY)
Multiple database support was added in MongoEngine 0.6. To use multiple
databases you can use connect()
and provide an alias name
for the connection - if no alias is provided then “default” is used.
In the background this uses register_connection()
to
store the data and you can register all aliases up front if required.
Individual documents can also support multiple databases by providing a
db_alias in their meta data. This allows DBRef
objects
to point across databases and collections. Below is an example schema, using
3 different databases to store data:
class User(Document):
    name = StringField()
    meta = {"db_alias": "user-db"}

class Book(Document):
    name = StringField()
    meta = {"db_alias": "book-db"}

class AuthorBooks(Document):
    author = ReferenceField(User)
    book = ReferenceField(Book)
    meta = {"db_alias": "users-books-db"}
Sometimes you may want to switch the database or collection to query against for a class, for example, when archiving older data into a separate database for performance reasons, or when writing functions that dynamically choose collections to write documents to.
The switch_db
context manager allows
you to change the database alias for a given class, allowing quick and easy
access to the same User document across databases:
from mongoengine.context_managers import switch_db
class User(Document):
    name = StringField()
    meta = {"db_alias": "user-db"}

with switch_db(User, 'archive-user-db') as User:
    User(name="Ross").save()  # Saves to the 'archive-user-db'
The switch_collection
context manager
allows you to change the collection for a given class, allowing quick and easy
access to the same Group document across collections:
from mongoengine.context_managers import switch_collection
class Group(Document):
    name = StringField()

Group(name="test").save()  # Saves in the default db

with switch_collection(Group, 'group2000') as Group:
    Group(name="hello Group 2000 collection!").save()  # Saves in the group2000 collection
Note
Make sure any aliases have been registered with
register_connection()
or connect()
before using the context manager.
In MongoDB, a document is roughly equivalent to a row in an RDBMS. When working with relational databases, rows are stored in tables, which have a strict schema that the rows follow. MongoDB stores documents in collections rather than tables — the principal difference is that no schema is enforced at a database level.
MongoEngine allows you to define schemata for documents as this helps to reduce coding errors, and allows for utility methods to be defined on fields which may be present.
To define a schema for a document, create a class that inherits from
Document
. Fields are specified by adding field
objects as class attributes to the document class:
from mongoengine import *
import datetime
class Page(Document):
    title = StringField(max_length=200, required=True)
    date_modified = DateTimeField(default=datetime.datetime.now)
As BSON (the binary format for storing data in MongoDB) is order-dependent, documents are serialized based on their field order.
One of the benefits of MongoDB is dynamic schemas for a collection. Whilst data should be planned and organised (after all, explicit is better than implicit!), there are scenarios where having dynamic / expando style documents is desirable.
DynamicDocument
documents work in the same way as
Document
but any data / attributes set to them will also
be saved:
from mongoengine import *
class Page(DynamicDocument):
    title = StringField(max_length=200, required=True)
# Create a new page and add tags
>>> page = Page(title='Using MongoEngine')
>>> page.tags = ['mongodb', 'mongoengine']
>>> page.save()
>>> Page.objects(tags='mongoengine').count()
1
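Conceptually, a dynamic document behaves like an object that captures arbitrary attribute assignments for later saving. A toy sketch of that behaviour in plain Python (this is not MongoEngine's implementation), including the rule that dynamic field names cannot start with an underscore:

```python
class DynamicSketch:
    """Toy model of expando-style documents: any attribute set on the
    instance is recorded for saving, but names starting with '_' are
    rejected."""

    def __init__(self):
        object.__setattr__(self, '_data', {})

    def __setattr__(self, name, value):
        if name.startswith('_'):
            raise AttributeError("dynamic fields cannot start with '_'")
        self._data[name] = value
```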
Note
There is one caveat on Dynamic Documents: fields cannot start with _
Dynamic fields are stored in creation order after any declared fields.
By default, fields are not required. To make a field mandatory, set the
required
keyword argument of a field to True
. Fields also may have
validation constraints available (such as max_length
in the example
above). Fields may also take default values, which will be used if a value is
not provided. Default values may optionally be a callable, which will be called
to retrieve the value (such as in the above example). The field types available
are as follows:
BinaryField
BooleanField
ComplexDateTimeField
DateTimeField
DecimalField
DictField
DynamicField
EmailField
EmbeddedDocumentField
FileField
FloatField
GenericEmbeddedDocumentField
GenericReferenceField
GeoPointField
ImageField
IntField
ListField
MapField
ObjectIdField
ReferenceField
SequenceField
SortedListField
StringField
URLField
UUIDField
PointField
LineStringField
PolygonField
MultiPointField
MultiLineStringField
MultiPolygonField
Each field type can be customized by keyword arguments. The following keyword arguments can be set on all fields:
db_field (Default: None)
The MongoDB field name.
required (Default: False)
If set to True and the field is not set on the document instance, a
ValidationError will be raised when the document is validated.
default (Default: None)
A value to use when no value is set for this field.
The definition of default parameters follows the general rules of Python,
which means that some care should be taken when dealing with mutable default
objects (like in ListField or DictField):
class ExampleFirst(Document):
    # Default an empty list
    values = ListField(IntField(), default=list)

class ExampleSecond(Document):
    # Default a set of values
    values = ListField(IntField(), default=lambda: [1, 2, 3])

class ExampleDangerous(Document):
    # An .append call can add values to this shared default
    # (and hence to all following objects), instead of to just one object
    values = ListField(IntField(), default=[1, 2, 3])
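The danger here is ordinary Python behaviour, not anything MongoEngine-specific: a default expression is evaluated once, so a single list object ends up shared by everything that falls back on it. A minimal stand-alone demonstration:

```python
def make_doc(values=[1, 2, 3]):   # BUG: the default list is created once
    return {'values': values}

a = make_doc()
a['values'].append(4)             # mutates the shared default list

b = make_doc()                    # b['values'] is now [1, 2, 3, 4]!

def make_doc_safe(values=None):   # the default=lambda: [1, 2, 3] pattern
    return {'values': [1, 2, 3] if values is None else list(values)}
```

This is exactly why the callable form (`default=list` or `default=lambda: [1, 2, 3]`) is the recommended spelling.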
Note
Unsetting a field with a default value will revert back to the default.
unique (Default: False)
When True, no documents in the collection will have the same value for this field.
unique_with (Default: None)
A field name (or list of field names) that when taken together with this
field will not have two documents in the collection with the same value.
primary_key (Default: False)
When True, use this field as a primary key for the collection. DictField and EmbeddedDocuments both support being the primary key for a document.
Note
If set, this field is also accessible through the pk field.
choices (Default: None)
An iterable (e.g. a list or tuple) of choices to which the value of this field should be limited.
Can either be a nested sequence of tuples, where the first item of each tuple is the value stored in mongo and the second a human-readable label:
SIZE = (('S', 'Small'),
        ('M', 'Medium'),
        ('L', 'Large'),
        ('XL', 'Extra Large'),
        ('XXL', 'Extra Extra Large'))

class Shirt(Document):
    size = StringField(max_length=3, choices=SIZE)
Or a flat iterable just containing values:
SIZE = ('S', 'M', 'L', 'XL', 'XXL')

class Shirt(Document):
    size = StringField(max_length=3, choices=SIZE)
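Either form restricts the accepted values; for the nested form, only the first element of each tuple is what gets stored. The check a choices constraint performs can be sketched as follows (a hypothetical helper, not MongoEngine's actual validator):

```python
def is_valid_choice(value, choices):
    # Accepts a flat iterable of values, or (stored_value, label) pairs.
    choices = list(choices)
    if choices and isinstance(choices[0], (list, tuple)):
        allowed = [stored for stored, _label in choices]
    else:
        allowed = choices
    return value in allowed
```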
**kwargs (Optional)
You can supply additional metadata as arbitrary additional keyword arguments.
MongoDB allows storing lists of items. To add a list of items to a
Document
, use the ListField
field
type. ListField
takes another field object as its first
argument, which specifies which type elements may be stored within the list:
class Page(Document):
    tags = ListField(StringField(max_length=50))
MongoDB has the ability to embed documents within other documents. Schemata may
be defined for these embedded documents, just as they may be for regular
documents. To create an embedded document, just define a document as usual, but
inherit from EmbeddedDocument
rather than
Document
:
class Comment(EmbeddedDocument):
    content = StringField()
To embed the document within another document, use the
EmbeddedDocumentField
field type, providing the embedded
document class as the first argument:
class Page(Document):
    comments = ListField(EmbeddedDocumentField(Comment))

comment1 = Comment(content='Good work!')
comment2 = Comment(content='Nice article!')
page = Page(comments=[comment1, comment2])
Often, an embedded document may be used instead of a dictionary – generally
this is recommended as dictionaries don’t support validation or custom field
types. However, sometimes you will not know the structure of what you want to
store; in this situation a DictField
is appropriate:
class SurveyResponse(Document):
    date = DateTimeField()
    user = ReferenceField(User)
    answers = DictField()
survey_response = SurveyResponse(date=datetime.now(), user=request.user)
response_form = ResponseForm(request.POST)
survey_response.answers = response_form.cleaned_data()
survey_response.save()
Dictionaries can store complex data: other dictionaries, lists, and references to other objects, making them the most flexible field type available.
References may be stored to other documents in the database using the
ReferenceField
. Pass in another document class as the
first argument to the constructor, then simply assign document objects to the
field:
class User(Document):
    name = StringField()

class Page(Document):
    content = StringField()
    author = ReferenceField(User)
john = User(name="John Smith")
john.save()
post = Page(content="Test Page")
post.author = john
post.save()
The User
object is automatically turned into a reference behind the
scenes, and dereferenced when the Page
object is retrieved.
To add a ReferenceField
that references the document
being defined, use the string 'self'
in place of the document class as the
argument to ReferenceField
’s constructor. To reference a
document that has not yet been defined, use the name of the undefined document
as the constructor’s argument:
class Employee(Document):
    name = StringField()
    boss = ReferenceField('self')
    profile_page = ReferenceField('ProfilePage')

class ProfilePage(Document):
    content = StringField()
If you are implementing a one-to-many relationship via a list of references, the references are stored as DBRefs; to query, you need to pass an instance of the object to the query:
class User(Document):
    name = StringField()

class Page(Document):
    content = StringField()
    authors = ListField(ReferenceField(User))
bob = User(name="Bob Jones").save()
john = User(name="John Smith").save()
Page(content="Test Page", authors=[bob, john]).save()
Page(content="Another Page", authors=[john]).save()
# Find all pages Bob authored
Page.objects(authors__in=[bob])
# Find all pages that both Bob and John have authored
Page.objects(authors__all=[bob, john])
# Remove Bob from the authors for a page.
Page.objects(id='...').update_one(pull__authors=bob)
# Add John to the authors for a page.
Page.objects(id='...').update_one(push__authors=john)
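The pull__ and push__ modifiers map onto MongoDB's $pull and $push update operators. Their effect on the stored list can be sketched in plain Python:

```python
def pull(doc, field, value):
    """Sketch of $pull: remove every occurrence of value from the list."""
    doc[field] = [v for v in doc[field] if v != value]

def push(doc, field, value):
    """Sketch of $push: append value to the end of the list."""
    doc[field].append(value)

page = {'authors': ['bob', 'john']}
pull(page, 'authors', 'bob')      # page['authors'] == ['john']
push(page, 'authors', 'bob')      # page['authors'] == ['john', 'bob']
```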
By default, MongoDB doesn’t check the integrity of your data, so deleting
documents that other documents still hold references to will lead to consistency
issues. MongoEngine’s ReferenceField
adds some functionality to
safeguard against these kinds of database integrity problems, providing each
reference with a delete rule specification. A delete rule is specified by
supplying the reverse_delete_rule
attribute on the
ReferenceField
definition, like this:
class ProfilePage(Document):
    ...
    employee = ReferenceField('Employee', reverse_delete_rule=mongoengine.CASCADE)
The declaration in this example means that when an Employee
object is
removed, the ProfilePage
that references that employee is removed as
well. If a whole batch of employees is removed, all profile pages that are
linked are removed as well.
Its value can take any of the following constants:
mongoengine.DO_NOTHING
Don’t do anything (default).
mongoengine.DENY
Deny the deletion of the reference object.
mongoengine.NULLIFY
Updates the reference to null.
mongoengine.CASCADE
Deletes the documents associated with the reference.
mongoengine.PULL
Removes the reference to the object (using MongoDB’s “pull” operation) from any object’s fields of ListField (ReferenceField).
Warning
A safety note on setting up these delete rules! Since the delete rules are not recorded at the database level by MongoDB itself, but at runtime, in memory, by the MongoEngine module, it is of the utmost importance that the module that declares the relationship is loaded BEFORE the delete is invoked.
If, for example, the Employee
object lives in the
payroll
app, and the ProfilePage
in the people
app, it is extremely important that the people
app is loaded
before any employee is removed, because otherwise, MongoEngine could
never know this relationship exists.
In Django, be sure to put all apps that have such delete rule declarations in
their models.py
in the INSTALLED_APPS
tuple.
Warning
Signals are not triggered when doing cascading updates / deletes - if this is required you must manually handle the update / delete.
A second kind of reference field also exists,
GenericReferenceField
. This allows you to reference any
kind of Document
, and hence doesn’t take a
Document
subclass as a constructor argument:
class Link(Document):
    url = StringField()

class Post(Document):
    title = StringField()

class Bookmark(Document):
    bookmark_object = GenericReferenceField()
link = Link(url='http://hmarr.com/mongoengine/')
link.save()
post = Post(title='Using MongoEngine')
post.save()
Bookmark(bookmark_object=link).save()
Bookmark(bookmark_object=post).save()
Note
Using GenericReferenceField
s is slightly less
efficient than the standard ReferenceField
s, so if
you will only be referencing one document type, prefer the standard
ReferenceField
.
MongoEngine allows you to specify that a field should be unique across a
collection by providing unique=True
to a Field
’s
constructor. If you try to save a document that has the same value for a unique
field as a document that is already in the database, a
NotUniqueError
will be raised. You may also specify
multi-field uniqueness constraints by using unique_with
, which may be
either a single field name, or a list or tuple of field names:
class User(Document):
    username = StringField(unique=True)
    first_name = StringField()
    last_name = StringField(unique_with='first_name')
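A unique_with constraint means the combination of fields must be unique, not each field on its own. The check can be pictured like this (a hypothetical helper; in practice MongoDB enforces it with a compound unique index):

```python
def violates_unique_with(existing_pairs, first_name, last_name):
    # last_name is unique_with='first_name': the *pair* must be new,
    # so two users may share a last_name as long as first_name differs.
    return (first_name, last_name) in existing_pairs

existing = {('Ross', 'Lawley')}
```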
You can also skip the whole document validation process by setting
validate=False
when calling the save()
method:
class Recipient(Document):
    name = StringField()
    email = EmailField()
recipient = Recipient(name='admin', email='root@localhost')
recipient.save()                # will raise a ValidationError
recipient.save(validate=False)  # won’t
Document classes that inherit directly from Document
will have their own collection in the database. The name of the collection
is by default the name of the class, converted to lowercase (so in the example
above, the collection would be called page). If you need to change the name
of the collection (e.g. to use MongoEngine with an existing database), then
create a class dictionary attribute called meta
on your document, and
set collection
to the name of the collection that you want your
document class to use:
class Page(Document):
    title = StringField(max_length=200, required=True)
    meta = {'collection': 'cmsPage'}
A Document
may use a Capped Collection by specifying
max_documents
and max_size
in the meta
dictionary.
max_documents
is the maximum number of documents that is allowed to be
stored in the collection, and max_size
is the maximum size of the
collection in bytes. max_size
is rounded up to the next multiple of 256
by MongoDB internally (and by MongoEngine beforehand), so specify a multiple of 256 to
avoid confusion. If max_size
is not specified and
max_documents
is, max_size
defaults to 10485760 bytes (10MB).
The following example shows a Log
document that will be limited to
1000 entries and 2MB of disk space:
class Log(Document):
    ip_address = StringField()
    meta = {'max_documents': 1000, 'max_size': 2000000}
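The 256-byte rounding mentioned above can be computed directly; a sketch:

```python
def capped_max_size(requested_bytes):
    # Round up to the next multiple of 256, as MongoDB does internally
    # for a capped collection's max_size.
    return -(-requested_bytes // 256) * 256
```

So the 2000000 bytes requested in the Log example actually becomes 2000128 bytes on the server, which is why specifying a multiple of 256 up front avoids surprises.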
You can specify indexes on collections to make querying faster. This is done
by creating a list of index specifications called indexes
in the
meta
dictionary, where an index specification may
either be a single field name, a tuple containing multiple field names, or a
dictionary containing a full index definition.
A direction may be specified on fields by prefixing the field name with a + (for ascending) or a - sign (for descending). Note that direction only matters on multi-field indexes. Text indexes may be specified by prefixing the field name with a $. Hashed indexes may be specified by prefixing the field name with a #:
class Page(Document):
    category = IntField()
    title = StringField()
    rating = StringField()
    created = DateTimeField()
    meta = {
        'indexes': [
            'title',
            '$title',  # text index
            '#title',  # hashed index
            ('title', '-rating'),
            ('category', '_cls'),
            {
                'fields': ['created'],
                'expireAfterSeconds': 3600
            }
        ]
    }
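The single-field shorthand can be read mechanically: an optional one-character prefix selects the direction or index type. A simplified parser illustrating the convention (this is not MongoEngine's internal code):

```python
def parse_index_spec(spec):
    # '+name'/'-name' -> ascending/descending, '$name' -> text,
    # '#name' -> hashed; bare names default to ascending.
    prefixes = {'+': 1, '-': -1, '$': 'text', '#': 'hashed'}
    if spec and spec[0] in prefixes:
        return (spec[1:], prefixes[spec[0]])
    return (spec, 1)
```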
If a dictionary is passed then the following options are available:
fields (Default: None)
The fields to index. Specified in the same format as described above.
cls (Default: True)
If you have polymorphic models that inherit and have allow_inheritance turned on, you can configure whether the index should have the _cls field added automatically to the start of the index.
sparse (Default: False)
Whether the index should be sparse.
unique (Default: False)
Whether the index should be unique.
expireAfterSeconds (Optional)
Allows you to automatically expire data from a collection by setting the time in seconds after which the indexed field is considered expired.
Note
Inheritance adds extra field indices; see: Document inheritance.
There are a few top level defaults for all indexes that can be set:
class Page(Document):
    title = StringField()
    rating = StringField()
    meta = {
        'index_options': {},
        'index_background': True,
        'index_drop_dups': True,
        'index_cls': False
    }
index_options (Optional)
Set any default index options.
index_background (Optional)
Set whether indexes should be created in the background.
index_cls (Optional)
A way to turn off a specific index for _cls.
index_drop_dups (Optional)
Set whether a unique index should drop duplicates.
Note
Since MongoDB 3.0, drop_dups is no longer supported; it raises a Warning and has no effect.
Compound indexes can be created by adding the Embedded field or dictionary field name to the index definition.
Sometimes it’s more efficient to index only parts of Embedded / dictionary fields; in this case, use ‘dot’ notation to identify the value to index, e.g. rank.title.
The best geo index for MongoDB is the new “2dsphere”, which has an improved spherical model and provides better performance and more options when querying. The following fields will explicitly add a “2dsphere” index:
PointField
LineStringField
PolygonField
MultiPointField
MultiLineStringField
MultiPolygonField
As “2dsphere” indexes can be part of a compound index, you may not want the
automatic index but would prefer a compound index. In this example we turn off
auto indexing and explicitly declare a compound index on location
and datetime
:
class Log(Document):
    location = PointField(auto_index=False)
    datetime = DateTimeField()

    meta = {
        'indexes': [[("location", "2dsphere"), ("datetime", 1)]]
    }
Note
For MongoDB < 2.4 this is still current, however the new 2dsphere index is a big improvement over the previous 2D model - so upgrading is advised.
Geospatial indexes will be automatically created for all
GeoPointFields.
It is also possible to explicitly define geospatial indexes. This is
useful if you need to define a geospatial index on a subfield of a
DictField
or a custom field that contains a
point. To create a geospatial index you must prefix the field with the
* sign.
class Place(Document):
    location = DictField()
    meta = {
        'indexes': [
            '*location.point',
        ],
    }
A special index type that allows you to automatically expire data from a collection after a given period. See the official TTL documentation for more information. A common use case might be session data:
class Session(Document):
    created = DateTimeField(default=datetime.now)
    meta = {
        'indexes': [
            {'fields': ['created'], 'expireAfterSeconds': 3600}
        ]
    }
Warning
TTL indexes happen on the MongoDB server and not in the application code, therefore no signals will be fired on document deletion. If you need signals to be fired on deletion, then you must handle the deletion of Documents in your application code.
Use mongoengine.Document.compare_indexes()
to compare actual indexes in
the database to those that your document definitions define. This is useful
for maintenance purposes and ensuring you have the correct indexes for your
schema.
A default ordering can be specified for your QuerySet using the ordering
attribute of meta. Ordering will be applied when the QuerySet is created,
and can be overridden by subsequent calls to order_by().
from datetime import datetime
class BlogPost(Document):
    title = StringField()
    published_date = DateTimeField()
    meta = {
        'ordering': ['-published_date']
    }
blog_post_1 = BlogPost(title="Blog Post #1")
blog_post_1.published_date = datetime(2010, 1, 5, 0, 0, 0)
blog_post_2 = BlogPost(title="Blog Post #2")
blog_post_2.published_date = datetime(2010, 1, 6, 0, 0, 0)
blog_post_3 = BlogPost(title="Blog Post #3")
blog_post_3.published_date = datetime(2010, 1, 7, 0, 0, 0)
blog_post_1.save()
blog_post_2.save()
blog_post_3.save()
# get the "first" BlogPost using default ordering
# from BlogPost.meta.ordering
latest_post = BlogPost.objects.first()
assert latest_post.title == "Blog Post #3"
# override default ordering, order BlogPosts by "published_date"
first_post = BlogPost.objects.order_by("+published_date").first()
assert first_post.title == "Blog Post #1"
If your collection is sharded, then you need to specify the shard key as a
tuple, using the shard_key attribute of meta. This ensures that the shard key
is sent with the query when calling the save() or update() method on an
existing Document instance:
class LogEntry(Document):
    machine = StringField()
    app = StringField()
    timestamp = DateTimeField()
    data = StringField()
    meta = {
        'shard_key': ('machine', 'timestamp',)
    }
To create a specialised type of a Document you have defined, you may subclass
it and add any extra fields or methods you may need. As this new class is not
a direct subclass of Document, it will not be stored in its own collection; it
will use the same collection as its superclass. This allows for more
convenient and efficient retrieval of related documents – all you need to do
is set allow_inheritance to True in the meta data for a document:
# Stored in a collection named 'page'
class Page(Document):
    title = StringField(max_length=200, required=True)
    meta = {'allow_inheritance': True}

# Also stored in the collection named 'page'
class DatedPage(Page):
    date = DateTimeField()
Note
From 0.8 onwards allow_inheritance
defaults
to False, meaning you must set it to True to use inheritance.
As MongoEngine no longer defaults to needing _cls, you can quickly and easily
get working with existing data. Just define the document to match the expected
schema in your database:
# Will work with data in an existing collection named 'cmsPage'
class Page(Document):
    title = StringField(max_length=200, required=True)
    meta = {
        'collection': 'cmsPage'
    }
If you have wildly varying schemas then using a
DynamicDocument
might be more appropriate, instead of
defining all possible field types.
If you use Document
and the database contains data that
isn’t defined then that data will be stored in the document._data dictionary.
If you want to add some extra functionality to a group of Document classes but
you don’t need or want the overhead of inheritance you can use the
abstract
attribute of meta
.
This won’t turn on Document inheritance but will allow you to keep your
code DRY:
class BaseDocument(Document):
    meta = {
        'abstract': True,
    }

    def check_permissions(self):
        ...

class User(BaseDocument):
    ...
Now the User class will have access to the inherited check_permissions method and won’t store any of the extra _cls information.
To create a new document object, create an instance of the relevant document class, providing values for its fields as constructor keyword arguments. You may provide values for any of the fields on the document:
>>> page = Page(title="Test Page")
>>> page.title
'Test Page'
You may also assign values to the document’s fields using standard object attribute syntax:
>>> page.title = "Example Page"
>>> page.title
'Example Page'
MongoEngine tracks changes to documents to provide efficient saving. To save
the document to the database, call the save()
method.
If the document does not exist in the database, it will be created. If it does
already exist, then any changes will be updated atomically. For example:
>>> page = Page(title="Test Page")
>>> page.save() # Performs an insert
>>> page.title = "My Page"
>>> page.save() # Performs an atomic set on the title field.
Note
Changes to documents are tracked and, on the whole, perform set operations.
list_field.push(0) – sets the resulting list
del(list_field) – unsets the whole list
With lists it's preferable to use Doc.update(push__list_field=0), as this
stops the whole list from being updated – stopping any race conditions.
MongoEngine allows you to create custom cleaning rules for your documents when
calling save(). By providing a custom clean() method you can do any
pre-validation / data cleaning.
This might be useful if you want to ensure a default value based on other document values for example:
class Essay(Document):
    status = StringField(choices=('Published', 'Draft'), required=True)
    pub_date = DateTimeField()

    def clean(self):
        """Ensures that only published essays have a `pub_date` and
        automatically sets the pub_date if published and not set"""
        if self.status == 'Draft' and self.pub_date is not None:
            msg = 'Draft entries should not have a publication date.'
            raise ValidationError(msg)
        # Set the pub_date for published items if not set.
        if self.status == 'Published' and self.pub_date is None:
            self.pub_date = datetime.now()
Note
Cleaning is only called if validation is turned on and when calling
save()
.
If your document contains ReferenceField or GenericReferenceField objects,
then by default the save() method will not save any changes to those objects.
If you want all references to be saved also – noting that each save is a
separate query – then passing cascade as True to the save method will cascade
any saves.
Each document in the database has a unique id. This may be accessed through
the id attribute on Document objects. Usually, the id will be generated
automatically by the database server when the object is saved, meaning that
you may only access the id field once a document has been saved:
>>> page = Page(title="Test Page")
>>> page.id
>>> page.save()
>>> page.id
ObjectId('123456789abcdef000000000')
Alternatively, you may define one of your own fields to be the document’s
“primary key” by providing primary_key=True
as a keyword argument to a
field’s constructor. Under the hood, MongoEngine will use this field as the
id
; in fact id
is actually aliased to your primary key field so
you may still use id
to access the primary key if you want:
>>> class User(Document):
... email = StringField(primary_key=True)
... name = StringField()
...
>>> bob = User(email='bob@example.com', name='Bob')
>>> bob.save()
>>> bob.id == bob.email == 'bob@example.com'
True
You can also access the document’s “primary key” using the pk
field,
it’s an alias to id
:
>>> page = Page(title="Another Test Page")
>>> page.save()
>>> page.id == page.pk
True
Note
If you define your own primary key field, the field implicitly becomes
required, so a ValidationError
will be thrown if
you don’t provide it.
Document
classes have an objects
attribute, which
is used for accessing the objects in the database associated with the class.
The objects
attribute is actually a
QuerySetManager
, which creates and returns a new
QuerySet
object on access. The
QuerySet
object may be iterated over to
fetch documents from the database:
# Prints out the names of all the users in the database
for user in User.objects:
    print(user.name)
Note
As of MongoEngine 0.8, querysets utilise a local cache, so iterating over a
queryset multiple times will only cause a single query. If this is not the
desired behaviour you can call no_cache (version 0.8.3+) to return a
non-caching queryset.
The query may be filtered by calling the
QuerySet
object with field lookup keyword
arguments. The keys in the keyword arguments correspond to fields on the
Document
you are querying:
# This will return a QuerySet that will only iterate over users whose
# 'country' field is set to 'uk'
uk_users = User.objects(country='uk')
Fields on embedded documents may also be referred to using field lookup syntax by using a double-underscore in place of the dot in object attribute access syntax:
# This will return a QuerySet that will only iterate over pages that have
# been written by a user whose 'country' field is set to 'uk'
uk_pages = Page.objects(author__country='uk')
Note
(version 0.9.1+) If your field name matches a MongoDB operator name (for
example type, lte, lt...) and you want to place it at the end of a lookup
keyword, MongoEngine will automatically prepend $ to it. To avoid this, use
__ at the end of your lookup keyword. For example, if your field name is type
and you want to query by this field, you must use
.objects(user__type__="admin") instead of .objects(user__type="admin").
Operators other than equality may also be used in queries — just attach the operator name to a key with a double-underscore:
# Only find users whose age is 18 or less
young_users = User.objects(age__lte=18)
Available operators are as follows:
ne – not equal to
lt – less than
lte – less than or equal to
gt – greater than
gte – greater than or equal to
not – negate a standard check, may be used before other operators (e.g. Q(age__not__mod=5))
in – value is in list (a list of values should be provided)
nin – value is not in list (a list of values should be provided)
mod – value % x == y, where x and y are two provided values
all – every item in the list of values provided is in the array
size – the size of the array is
exists – value for field exists
The following operators are available as shortcuts to querying with regular expressions:
exact – string field exactly matches value
iexact – string field exactly matches value (case insensitive)
contains – string field contains value
icontains – string field contains value (case insensitive)
startswith – string field starts with value
istartswith – string field starts with value (case insensitive)
endswith – string field ends with value
iendswith – string field ends with value (case insensitive)
match – performs an $elemMatch so you can match an entire document within an array
There are a few special operators for performing geographical queries.
The following were added in MongoEngine 0.8 for PointField, LineStringField
and PolygonField:
geo_within
– check if a geometry is within a polygon. For ease of use
it accepts either a geojson geometry or just the polygon coordinates eg:
loc.objects(point__geo_within=[[[40, 5], [40, 6], [41, 6], [40, 5]]])
loc.objects(point__geo_within={"type": "Polygon",
"coordinates": [[[40, 5], [40, 6], [41, 6], [40, 5]]]})
geo_within_box
– simplified geo_within searching with a box eg:
loc.objects(point__geo_within_box=[(-125.0, 35.0), (-100.0, 40.0)])
loc.objects(point__geo_within_box=[<bottom left coordinates>, <upper right coordinates>])
geo_within_polygon
– simplified geo_within searching within a simple polygon eg:
loc.objects(point__geo_within_polygon=[[40, 5], [40, 6], [41, 6], [40, 5]])
loc.objects(point__geo_within_polygon=[ [ <x1> , <y1> ] ,
[ <x2> , <y2> ] ,
[ <x3> , <y3> ] ])
geo_within_center
– simplified geo_within the flat circle radius of a point eg:
loc.objects(point__geo_within_center=[(-125.0, 35.0), 1])
loc.objects(point__geo_within_center=[ [ <x>, <y> ] , <radius> ])
geo_within_sphere
– simplified geo_within the spherical circle radius of a point eg:
loc.objects(point__geo_within_sphere=[(-125.0, 35.0), 1])
loc.objects(point__geo_within_sphere=[ [ <x>, <y> ] , <radius> ])
geo_intersects
– selects all locations that intersect with a geometry eg:
# Inferred from provided points lists:
loc.objects(poly__geo_intersects=[40, 6])
loc.objects(poly__geo_intersects=[[40, 5], [40, 6]])
loc.objects(poly__geo_intersects=[[[40, 5], [40, 6], [41, 6], [41, 5], [40, 5]]])
# With geoJson style objects
loc.objects(poly__geo_intersects={"type": "Point", "coordinates": [40, 6]})
loc.objects(poly__geo_intersects={"type": "LineString",
"coordinates": [[40, 5], [40, 6]]})
loc.objects(poly__geo_intersects={"type": "Polygon",
"coordinates": [[[40, 5], [40, 6], [41, 6], [41, 5], [40, 5]]]})
near
– find all the locations near a given point:
loc.objects(point__near=[40, 5])
loc.objects(point__near={"type": "Point", "coordinates": [40, 5]})
You can also set the maximum and/or the minimum distance in meters as well:
loc.objects(point__near=[40, 5], point__max_distance=1000)
loc.objects(point__near=[40, 5], point__min_distance=100)
The older 2D indexes are still supported with the GeoPointField:
within_distance
– provide a list containing a point and a maximum
distance (e.g. [(41.342, -87.653), 5])
within_spherical_distance
– same as above but using the spherical geo model
(e.g. [(41.342, -87.653), 5/earth_radius])
near
– order the documents by how close they are to a given point
near_sphere
– Same as above but using the spherical geo model
within_box
– filter documents to those within a given bounding box (e.g.
[(35.0, -125.0), (40.0, -100.0)])
within_polygon
– filter documents to those within a given polygon (e.g.
[(41.91,-87.69), (41.92,-87.68), (41.91,-87.65), (41.89,-87.65)]).
Note
Requires Mongo Server 2.0
max_distance
– can be added to your location queries to set a maximum
distance.
min_distance
– can be added to your location queries to set a minimum
distance.
On most fields, this syntax will look up documents where the field specified
matches the given value exactly, but when the field refers to a
ListField
, a single item may be provided, in which case
lists that contain that item will be matched:
class Page(Document):
    tags = ListField(StringField())

# This will match all pages that have the word 'coding' as an item in the
# 'tags' list
Page.objects(tags='coding')
It is possible to query by position in a list by using a numerical value as a
query operator. So if you wanted to find all pages whose first tag was db,
you could use the following query:
Page.objects(tags__0='db')
If you only want to fetch part of a list, e.g. to paginate a list, then the slice operator is required:
# comments - skip 5, limit 10
Page.objects.fields(slice__comments=[5, 10])
For updating documents, if you don’t know the position in a list, you can use the $ positional operator
Post.objects(comments__by="joe").update(**{'inc__comments__$__votes': 1})
However, this doesn’t map well to the syntax so you can also use a capital S instead
Post.objects(comments__by="joe").update(inc__comments__S__votes=1)
Note
Due to Mongo, currently the $ operator only applies to the first matched item in the query.
It is possible to provide a raw PyMongo
query as a query parameter, which will
be integrated directly into the query. This is done using the __raw__
keyword argument:
Page.objects(__raw__={'tags': 'coding'})
New in version 0.4.
Just as with traditional ORMs, you may limit the number of results returned
or skip a number of results in your query. The limit() and skip() methods are
available on QuerySet objects, but the array-slicing syntax is preferred for
achieving this:
# Only the first 5 people
users = User.objects[:5]
# All except for the first 5 people
users = User.objects[5:]
# 5 users, starting from the 10th user found
users = User.objects[10:15]
You may also index the query to retrieve a single result. If an item at that
index does not exist, an IndexError will be raised. A shortcut for retrieving
the first result and returning None if no result exists is provided
(first()):
>>> # Make sure there are no users
>>> User.drop_collection()
>>> User.objects[0]
IndexError: list index out of range
>>> User.objects.first() == None
True
>>> User(name='Test User').save()
>>> User.objects[0] == User.objects.first()
True
To retrieve a result that should be unique in the collection, use get(). This
will raise DoesNotExist if no document matches the query, and
MultipleObjectsReturned if more than one document matches the query. These
exceptions are merged into your document definitions, e.g. MyDoc.DoesNotExist.
A variation of this method, get_or_create(), existed, but it was unsafe. It could not be made safe because there are no transactions in MongoDB. Other approaches should be investigated to ensure you don't accidentally duplicate data when using something similar to this method. It was therefore deprecated in 0.8 and removed in 0.10.
By default, the objects attribute on a document returns a QuerySet that
doesn't filter the collection – it returns all objects. This may be changed
by defining a method on a document that modifies a queryset. The method
should accept two arguments – doc_cls and queryset. The first argument is the
Document class that the method is defined on (in this sense, the method is
more like a classmethod() than a regular method), and the second argument is
the initial queryset. The method needs to be decorated with
queryset_manager() in order for it to be recognised.
class BlogPost(Document):
    title = StringField()
    date = DateTimeField()

    @queryset_manager
    def objects(doc_cls, queryset):
        # This may actually also be done by defining a default ordering for
        # the document, but this illustrates the use of manager methods
        return queryset.order_by('-date')
You don’t need to call your method objects
– you may define as many
custom manager methods as you like:
class BlogPost(Document):
    title = StringField()
    published = BooleanField()

    @queryset_manager
    def live_posts(doc_cls, queryset):
        return queryset.filter(published=True)
BlogPost(title='test1', published=False).save()
BlogPost(title='test2', published=True).save()
assert len(BlogPost.objects) == 2
assert len(BlogPost.live_posts()) == 1
Should you want to add custom methods for interacting with or filtering
documents, extending the QuerySet class may be the way to go. To use a custom
QuerySet class on a document, set queryset_class to the custom class in a
Document's meta dictionary:
class AwesomerQuerySet(QuerySet):

    def get_awesome(self):
        return self.filter(awesome=True)

class Page(Document):
    meta = {'queryset_class': AwesomerQuerySet}

# To call:
Page.objects.get_awesome()
New in version 0.4.
MongoDB provides some aggregation methods out of the box, but there are not as many as you typically get with an RDBMS. MongoEngine provides a wrapper around the built-in methods and provides some of its own, which are implemented as Javascript code that is executed on the database server.
Just as with limiting and skipping results, there is a method on
QuerySet
objects –
count()
, but there is also a more Pythonic
way of achieving this:
num_users = len(User.objects)
Even though len() is the Pythonic way of counting results, keep in mind that
if you are concerned about performance, count() is the way to go, since it
only executes a server-side count query, while len() retrieves the results,
places them in a local cache, and finally counts them. If we compare the
performance of the two operations, len() is much slower than count().
You may sum over the values of a specific field on documents using
sum()
:
yearly_expense = Employee.objects.sum('salary')
Note
If the field isn't present on a document, that document will be excluded from the sum.
To get the average (mean) of a field on a collection of documents, use
average()
:
mean_age = User.objects.average('age')
As MongoDB provides native lists, MongoEngine provides a helper method to get a
dictionary of the frequencies of items in lists across an entire collection –
item_frequencies()
. An example of its use
would be generating “tag-clouds”:
class Article(Document):
    tag = ListField(StringField())

# After adding some tagged articles...
tag_freqs = Article.objects.item_frequencies('tag', normalize=True)

from operator import itemgetter
top_tags = sorted(tag_freqs.items(), key=itemgetter(1), reverse=True)[:10]
There are a couple of methods to improve efficiency when querying: reducing the information returned by the query, and efficient dereferencing.
Sometimes a subset of fields on a Document is required, and for efficiency
only these should be retrieved from the database. This issue is especially
important for MongoDB, as fields may often be extremely large (e.g. a
ListField of EmbeddedDocuments, which might represent the comments on a blog
post). To select only a subset of fields, use only(), specifying the fields
you want to retrieve as its arguments. Note that if fields that are not
downloaded are accessed, their default value (or None if no default value is
provided) will be given:
>>> class Film(Document):
... title = StringField()
... year = IntField()
... rating = IntField(default=3)
...
>>> Film(title='The Shawshank Redemption', year=1994, rating=5).save()
>>> f = Film.objects.only('title').first()
>>> f.title
'The Shawshank Redemption'
>>> f.year # None
>>> f.rating # default value
3
If you later need the missing fields, just call
reload()
on your document.
Sometimes for performance reasons you don’t want to automatically dereference
data. To turn off dereferencing of the results of a query use
no_dereference()
on the queryset like so:
post = Post.objects.no_dereference().first()
assert(isinstance(post.author, ObjectId))
You can also turn off all dereferencing for a fixed period by using the
no_dereference
context manager:
with no_dereference(Post) as Post:
    post = Post.objects.first()
    assert(isinstance(post.author, ObjectId))

# Outside the context manager dereferencing occurs.
assert(isinstance(post.author, User))
Sometimes calling a QuerySet
object with keyword
arguments can’t fully express the query you want to use – for example if you
need to combine a number of constraints using and and or. This is made
possible in MongoEngine through the Q
class.
A Q
object represents part of a query, and
can be initialised using the same keyword-argument syntax you use to query
documents. To build a complex query, you may combine
Q
objects using the &
(and) and |
(or)
operators. To use a Q
object, pass it in as the
first positional argument to Document.objects
when you filter it by
calling it with keyword arguments:
# Get published posts
Post.objects(Q(published=True) | Q(publish_date__lte=datetime.now()))
# Get top posts
Post.objects((Q(featured=True) & Q(hits__gte=1000)) | Q(hits__gte=5000))
Warning
You have to use the bitwise operators. You cannot use or and and to combine
queries, as Q(a=a) or Q(b=b) is not the same as Q(a=a) | Q(b=b). Since
Q(a=a) equates to true, Q(a=a) or Q(b=b) is the same as Q(a=a).
Documents may be updated atomically by using the update_one(), update() and
modify() methods on a QuerySet, or modify() and save() (with the
save_condition argument) on a Document.
There are several different “modifiers” that you may use with these methods:
set – set a particular value
unset – delete a particular value (since MongoDB v1.3)
inc – increment a value by a given amount
dec – decrement a value by a given amount
push – append a value to a list
push_all – append several values to a list
pop – remove the first or last element of a list depending on the value
pull – remove a value from a list
pull_all – remove several values from a list
add_to_set – add a value to a list only if it's not in the list already
The syntax for atomic updates is similar to the querying syntax, but the modifier comes before the field, not after it:
>>> post = BlogPost(title='Test', page_views=0, tags=['database'])
>>> post.save()
>>> BlogPost.objects(id=post.id).update_one(inc__page_views=1)
>>> post.reload() # the document has been changed, so we need to reload it
>>> post.page_views
1
>>> BlogPost.objects(id=post.id).update_one(set__title='Example Post')
>>> post.reload()
>>> post.title
'Example Post'
>>> BlogPost.objects(id=post.id).update_one(push__tags='nosql')
>>> post.reload()
>>> post.tags
['database', 'nosql']
Note
If no modifier operator is specified, the default will be $set. So the
following statements are identical:
>>> BlogPost.objects(id=post.id).update(title='Example Post')
>>> BlogPost.objects(id=post.id).update(set__title='Example Post')
Note
In version 0.5 the save()
runs atomic updates
on changed documents by tracking changes to that document.
The positional operator allows you to update list items without knowing the index position, therefore making the update a single atomic operation. As we cannot use the $ syntax in keyword arguments it has been mapped to S:
>>> post = BlogPost(title='Test', page_views=0, tags=['database', 'mongo'])
>>> post.save()
>>> BlogPost.objects(id=post.id, tags='mongo').update(set__tags__S='mongodb')
>>> post.reload()
>>> post.tags
['database', 'mongodb']
Note
Currently only top level lists are handled, future versions of mongodb / pymongo plan to support nested positional operators. See The $ positional operator.
Javascript functions may be written and sent to the server for execution. The
result of this is the return value of the Javascript function. This
functionality is accessed through the exec_js() method on QuerySet objects.
Pass in a string containing a Javascript function as the first argument.
The remaining positional arguments are names of fields that will be passed
into your Javascript function as its arguments. This allows functions to be
written that may be executed on any field in a collection (e.g. the sum()
method, which accepts the name of the field to sum over as its argument).
Note that field names passed in this manner are automatically translated to
the names used on the database (set using the db_field keyword argument to a
field constructor).
Keyword arguments to exec_js()
are
combined into an object called options
, which is available in the
Javascript function. This may be used for defining specific parameters for your
function.
Some variables are made available in the scope of the Javascript function:
collection – the name of the collection that corresponds to the Document
class that is being used; this should be used to get the Collection object
from db in Javascript code
query – the query that has been generated by the QuerySet object; this may be
passed into the find() method on a Collection object in the Javascript
function
options – an object containing the keyword arguments passed into exec_js()
The following example demonstrates the intended usage of exec_js() by
defining a function that sums over a field on a document (this functionality
is already available through sum() but is shown here for the sake of
example):
def sum_field(document, field_name, include_negatives=True):
    code = """
    function(sumField) {
        var total = 0.0;
        db[collection].find(query).forEach(function(doc) {
            var val = doc[sumField];
            if (val >= 0.0 || options.includeNegatives) {
                total += val;
            }
        });
        return total;
    }
    """
    options = {'includeNegatives': include_negatives}
    return document.objects.exec_js(code, field_name, **options)
As fields in MongoEngine may use different names in the database (set using the
db_field
keyword argument to a Field
constructor), a mechanism
exists for replacing MongoEngine field names with the database field names in
Javascript code. When accessing a field on a collection object, use
square-bracket notation, and prefix the MongoEngine field name with a tilde.
The field name that follows the tilde will be translated to the name used in
the database. Note that when referring to fields on embedded documents,
the name of the EmbeddedDocumentField
, followed by a dot,
should be used before the name of the field on the embedded document. The
following example shows how the substitutions are made:
class Comment(EmbeddedDocument):
    content = StringField(db_field='body')

class BlogPost(Document):
    title = StringField(db_field='doctitle')
    comments = ListField(EmbeddedDocumentField(Comment), db_field='cs')
# Returns a list of dictionaries. Each dictionary contains a value named
# "document", which corresponds to the "title" field on a BlogPost, and
# "comment", which corresponds to an individual comment. The substitutions
# made are shown in the comments.
BlogPost.objects.exec_js("""
function() {
    var comments = [];
    db[collection].find(query).forEach(function(doc) {
        // doc[~comments] -> doc["cs"]
        var docComments = doc[~comments];
        for (var i = 0; i < docComments.length; i++) {
            // doc[~comments][i] -> doc["cs"][i]
            var comment = doc[~comments][i];
            comments.push({
                // doc[~title] -> doc["doctitle"]
                'document': doc[~title],
                // comment[~comments.content] -> comment["body"]
                'comment': comment[~comments.content]
            });
        }
    });
    return comments;
}
""")
New in version 0.4.
GridFS support comes in the form of the FileField
field
object. This field acts as a file-like object and provides a couple of
different ways of inserting and retrieving data. Arbitrary metadata such as
content type can also be stored alongside the files. In the following example,
a document is created to store details about animals, including a photo:
class Animal(Document):
    genus = StringField()
    family = StringField()
    photo = FileField()

marmot = Animal(genus='Marmota', family='Sciuridae')
marmot_photo = open('marmot.jpg', 'rb')
marmot.photo.put(marmot_photo, content_type='image/jpeg')
marmot.save()
So using the FileField
is just like using any other
field. The file can also be retrieved just as easily:
marmot = Animal.objects(genus='Marmota').first()
photo = marmot.photo.read()
content_type = marmot.photo.content_type
Streaming data into a FileField
is achieved in a
slightly different manner. First, a new file must be created by calling the
new_file()
method. Data can then be written using write()
:
marmot.photo.new_file()
marmot.photo.write('some_image_data')
marmot.photo.write('some_more_image_data')
marmot.photo.close()
marmot.save()
Deleting stored files is achieved with the delete()
method:
marmot.photo.delete()
Warning
The FileField in a Document actually only stores the ID of a file in a separate GridFS collection. This means that deleting a document with a defined FileField does not actually delete the file. You must be careful to delete any files in a Document as above before deleting the Document itself.
Files can be replaced with the replace()
method. This works just like
the put()
method so even metadata can (and should) be replaced:
another_marmot = open('another_marmot.png', 'rb')
marmot.photo.replace(another_marmot, content_type='image/png')
New in version 0.5.
Note
Signal support is provided by the excellent blinker library. If you wish to enable signal support this library must be installed, though it is not required for MongoEngine to function.
Signals are found within the mongoengine.signals module. Unless specified signals receive no additional arguments beyond the sender class and document instance. Post-signals are only called if there were no exceptions raised during the processing of their related function.
Available signals include:
pre_init – called during the creation of a new Document or EmbeddedDocument
instance, after the constructor arguments have been collected but before any
additional processing has been done to them (i.e. assignment of default
values). Handlers for this signal are passed the dictionary of arguments
using the values keyword argument and may modify this dictionary prior to
returning.
post_init – called after all processing of a new Document or EmbeddedDocument
instance has been completed.
pre_save – called within save() prior to performing any actions.
pre_save_post_validation – called within save() after validation has taken
place but before saving.
post_save – called within save() after all actions (validation,
insert/update, cascades, clearing dirty flags) have completed successfully.
Passed the additional boolean keyword argument created to indicate if the
save was an insert or an update.
pre_delete – called within delete() prior to attempting the delete operation.
post_delete – called within delete() upon successful deletion of the record.
post_bulk_insert – called after a bulk insert; passed the documents as
Document instances when the loaded keyword argument is True, or simply a list
of primary key values for the inserted records if False.
After writing a handler function like the following:
import logging
from datetime import datetime
from mongoengine import *
from mongoengine import signals
def update_modified(sender, document):
document.modified = datetime.utcnow()
You attach the event handler to your Document
or
EmbeddedDocument
subclass:
class Record(Document):
modified = DateTimeField()
signals.pre_save.connect(update_modified)
While this is not the most elaborate document model, it does demonstrate the concepts involved. As a more complete demonstration you can also define your handlers within your subclass:
class Author(Document):
name = StringField()
@classmethod
def pre_save(cls, sender, document, **kwargs):
logging.debug("Pre Save: %s" % document.name)
@classmethod
def post_save(cls, sender, document, **kwargs):
logging.debug("Post Save: %s" % document.name)
if 'created' in kwargs:
if kwargs['created']:
logging.debug("Created")
else:
logging.debug("Updated")
signals.pre_save.connect(Author.pre_save, sender=Author)
signals.post_save.connect(Author.post_save, sender=Author)
Finally, you can also use this small decorator to quickly create a number of
signals and attach them to your Document
or
EmbeddedDocument
subclasses as class decorators:
def handler(event):
"""Signal decorator to allow use of callback functions as class decorators."""
def decorator(fn):
def apply(cls):
event.connect(fn, sender=cls)
return cls
fn.apply = apply
return fn
return decorator
Using the first example of updating a modification time the code is now much cleaner looking while still allowing manual execution of the callback:
@handler(signals.pre_save)
def update_modified(sender, document):
document.modified = datetime.utcnow()
@update_modified.apply
class Record(Document):
modified = DateTimeField()
Currently reverse_delete_rule does not trigger signals on the other part of the relationship. If this is required you must manually handle the reverse deletion.
MongoDB 2.4 and later support searching documents via text indexes. Use the $ prefix on a field name to include it in a text index, as in this declaration:
class News(Document):
title = StringField()
content = StringField()
is_active = BooleanField()
meta = {'indexes': [
{'fields': ['$title', "$content"],
'default_language': 'english',
'weights': {'title': 10, 'content': 2}
}
]}
Saving a document:
News(title="Using mongodb text search",
content="Testing text search").save()
News(title="MongoEngine 0.9 released",
content="Various improvements").save()
Next, start a text search using QuerySet.search_text
method:
document = News.objects.search_text('testing').first()
document.title # may be: "Using mongodb text search"
document = News.objects.search_text('released').first()
document.title # may be: "MongoEngine 0.9 released"
objects = News.objects.search_text('mongo').order_by('$text_score')
mongoengine.
connect
(db=None, alias='default', **kwargs)¶Connect to the database specified by the ‘db’ argument.
Connection settings may be provided here as well if the database is not running on the default port on localhost. If authentication is needed, provide username and password arguments as well.
Multiple databases are supported by using aliases. Provide a separate alias to connect to a different instance of mongod.
Changed in version 0.6: - added multiple database support.
mongoengine.
register_connection
(alias, name=None, host=None, port=None, read_preference=Primary(), username=None, password=None, authentication_source=None, **kwargs)¶Add a connection.
mongoengine.
Document
(*args, **values)¶The base class used for defining the structure and properties of
collections of documents stored in MongoDB. Inherit from this class, and
add fields as class attributes to define a document’s structure.
Individual documents may then be created by making instances of the
Document
subclass.
By default, the MongoDB collection used to store documents created using a
Document
subclass will be the name of the subclass
converted to lowercase. A different collection may be specified by
providing collection
to the meta
dictionary in the class
definition.
A Document
subclass may be itself subclassed, to
create a specialised version of the document that will be stored in the
same collection. To facilitate this behaviour a _cls
field is added to documents (hidden through the MongoEngine interface).
To disable this behaviour and remove the dependence on the presence of
_cls set allow_inheritance
to False
in the meta
dictionary.
A Document
may use a Capped Collection by
specifying max_documents
and max_size
in the meta
dictionary. max_documents
is the maximum number of documents that
is allowed to be stored in the collection, and max_size
is the
maximum size of the collection in bytes. max_size is rounded up
to the next multiple of 256 by MongoDB internally (and by mongoengine before
sending it), so use a multiple of 256 yourself to avoid confusion. If max_size
is not
is not
specified and max_documents
is, max_size
defaults to
10485760 bytes (10MB).
Indexes may be created by specifying indexes
in the meta
dictionary. The value should be a list of field names or tuples of field
names. Index direction may be specified by prefixing the field names with
a + or - sign.
Automatic index creation can be disabled by specifying
auto_create_index
in the meta
dictionary. If this is set to
False then indexes will not be created by MongoEngine. This is useful in
production systems where index creation is performed as part of a
deployment system.
By default, _cls will be added to the start of every index (that doesn’t contain a list) if allow_inheritance is True. This can be disabled by either setting cls to False on the specific index or by setting index_cls to False on the meta dictionary for the document.
By default, any extra attribute existing in stored data but not declared
in your model will raise a FieldDoesNotExist
error.
This can be disabled by setting strict
to False
in the meta
dictionary.
Initialise a document or embedded document
cascade_save
(*args, **kwargs)¶Recursively saves any references / generic references on the document
compare_indexes
()¶Compares the indexes defined in MongoEngine with the ones existing in the database. Returns any missing/extra indexes.
create_index
(keys, background=False, **kwargs)¶Creates the given indexes if required.
delete
(**write_concern)¶Delete the Document
from the database. This
will only take effect if the document has been previously saved.
Parameters: write_concern – Extra keyword arguments are passed down which
will be used as options for the resultant
getLastError command. For example,
save(..., write_concern={w: 2, fsync: True}, ...) will
wait until at least two servers have recorded the write and
will force an fsync on the primary server.
drop_collection
()¶Drops the entire collection associated with this
Document
type from the database.
ensure_index
(key_or_list, drop_dups=False, background=False, **kwargs)¶Ensure that the given indexes are in place. Deprecated in favour of create_index.
ensure_indexes
()¶Checks the document meta data and ensures all the indexes exist.
Global defaults can be set in the meta - see Defining documents
Note
You can disable automatic index creation by setting auto_create_index to False in the document's meta data
list_indexes
()¶Lists all of the indexes that should be created for given collection. It includes all the indexes from super- and sub-classes.
modify
(query={}, **update)¶Perform an atomic update of the document in the database and reload the document object using updated version.
Returns True if the document has been updated or False if the document in the database doesn’t match the query.
Note
All unsaved changes that have been made to the document are rejected if the method returns True.
my_metaclass
¶alias of TopLevelDocumentMetaclass
register_delete_rule
(document_cls, field_name, rule)¶This method registers the delete rules to apply when removing this object.
reload
(*fields, **kwargs)¶Reloads all attributes from the database.
New in version 0.1.2.
Changed in version 0.6: Now chainable
Changed in version 0.9: Can provide specific fields to reload
save
(force_insert=False, validate=True, clean=True, write_concern=None, cascade=None, cascade_kwargs=None, _refs=None, save_condition=None, **kwargs)¶Save the Document
to the database. If the
document already exists, it will be updated, otherwise it will be
created.
Changed in version 0.5: In existing documents it only saves changed fields using
set / unset. Saves are cascaded and any
DBRef
objects that have changes are
saved as well.
Changed in version 0.6: Added cascading saves
Changed in version 0.8: Cascade saves are optional and default to False. If you want fine grain control then you can turn off using document meta[‘cascade’] = True. Also you can pass different kwargs to the cascade save using cascade_kwargs which overwrites the existing kwargs with custom values.
Changed in version 0.8.5: Optional save_condition that only overwrites existing documents if the condition is satisfied in the current db record.
Changed in version 0.10: OperationError
exception raised if save_condition fails.
Changed in version 0.10.1: save_condition failure now raises a SaveConditionError
select_related
(max_depth=1)¶Handles dereferencing of DBRef
objects to
a maximum depth in order to cut down the number of queries to MongoDB.
New in version 0.5.
switch_collection
(collection_name, keep_created=True)¶Temporarily switch the collection for a document instance.
Only really useful for archiving off data and calling save():
user = User.objects.get(id=user_id)
user.switch_collection('old-users')
user.save()
See also
Use switch_db
if you need to read from another database
switch_db
(db_alias, keep_created=True)¶Temporarily switch the database for a document instance.
Only really useful for archiving off data and calling save():
user = User.objects.get(id=user_id)
user.switch_db('archive-db')
user.save()
See also
Use switch_collection
if you need to read from another collection
to_dbref
()¶Returns an instance of DBRef
useful in
__raw__ queries.
mongoengine.
EmbeddedDocument
(*args, **kwargs)¶A Document
that isn’t stored in its own
collection. EmbeddedDocument
s should be used as
fields on Document
s through the
EmbeddedDocumentField
field type.
A EmbeddedDocument
subclass may be itself subclassed,
to create a specialised version of the embedded document that will be
stored in the same collection. To facilitate this behaviour a _cls
field is added to documents (hidden through the MongoEngine interface).
To disable this behaviour and remove the dependence on the presence of
_cls set allow_inheritance
to False
in the meta
dictionary.
my_metaclass
¶alias of DocumentMetaclass
mongoengine.
DynamicDocument
(*args, **values)¶A Dynamic Document class allowing flexible, expandable and uncontrolled
schemas. As a Document
subclass, acts in the same
way as an ordinary document but has expando style properties. Any data
passed or set against the DynamicDocument
that is
not a field is automatically converted into a
DynamicField
and data can be attributed to that
field.
Note
There is one caveat on Dynamic Documents: fields cannot start with _
Initialise a document or embedded document
my_metaclass
¶alias of TopLevelDocumentMetaclass
mongoengine.
DynamicEmbeddedDocument
(*args, **kwargs)¶A Dynamic Embedded Document class allowing flexible, expandable and
uncontrolled schemas. See DynamicDocument
for more
information about dynamic documents.
my_metaclass
¶alias of DocumentMetaclass
mongoengine.document.
MapReduceDocument
(document, collection, key, value)¶A document returned from a map/reduce query.
New in version 0.3.
object
¶Lazy-load the object referenced by self.key
. self.key
should be the primary_key
.
mongoengine.
ValidationError
(message='', **kwargs)¶Validation exception.
May represent an error validating a field or a document containing fields with validation errors.
Variables: errors – A dictionary of errors for fields within this document or list, or None if the error is for an individual field.
to_dict
()¶Returns a dictionary of all errors within a document
Keys are field names or list indices and values are the validation error messages, or a nested dictionary of errors for an embedded document or list.
mongoengine.
FieldDoesNotExist
¶Raised when trying to set a field
not declared in a Document
or an EmbeddedDocument
.
To avoid this behavior on data loading,
set strict
to False
in the meta
dictionary.
mongoengine.context_managers.
switch_db
(cls, db_alias)¶switch_db alias context manager.
Example
# Register connections
register_connection('default', 'mongoenginetest')
register_connection('testdb-1', 'mongoenginetest2')
class Group(Document):
name = StringField()
Group(name="test").save() # Saves in the default db
with switch_db(Group, 'testdb-1') as Group:
Group(name="hello testdb!").save() # Saves in testdb-1
Construct the switch_db context manager
mongoengine.context_managers.
switch_collection
(cls, collection_name)¶switch_collection alias context manager.
Example
class Group(Document):
name = StringField()
Group(name="test").save() # Saves in the default db
with switch_collection(Group, 'group1') as Group:
Group(name="hello testdb!").save() # Saves in group1 collection
Construct the switch_collection context manager
mongoengine.context_managers.
no_dereference
(cls)¶no_dereference context manager.
Turns off all dereferencing in Documents for the duration of the context manager:
with no_dereference(Group) as Group:
Group.objects.find()
Construct the no_dereference context manager.
Parameters: cls – the class to turn dereferencing off on
mongoengine.context_managers.
query_counter
¶Query_counter context manager to get the number of queries.
Construct the query_counter.
mongoengine.queryset.
QuerySet
(document, collection)¶The default queryset, that builds queries and handles a set of results returned from a query.
Wraps a MongoDB cursor, providing Document
objects as
the results.
__call__
(q_obj=None, class_check=True, read_preference=None, **query)¶Filter the selected documents by calling the
QuerySet
with a query.
aggregate
(*pipeline, **kwargs)¶Perform an aggregation based on your queryset parameters.
Parameters: pipeline – a list of aggregation commands; see http://docs.mongodb.org/manual/core/aggregation-pipeline/
New in version 0.9.
aggregate_average
(field)¶Average over the values of the specified field.
Parameters: field – the field to average over; use dot-notation to refer to embedded document fields
This method is more performant than the regular average, because it uses the aggregation framework instead of map-reduce.
aggregate_sum
(field)¶Sum over the values of the specified field.
Parameters: field – the field to sum over; use dot-notation to refer to embedded document fields
This method is more performant than the regular sum, because it uses the aggregation framework instead of map-reduce.
all
()¶Returns all documents.
all_fields
()¶Include all fields. Resets all previous calls to .only() or .exclude().
post = BlogPost.objects.exclude("comments").all_fields()
New in version 0.5.
as_pymongo
(coerce_types=False)¶Instead of returning Document instances, return raw values from pymongo.
Parameters: coerce_types – if True, field types (where applicable) are used to coerce the returned values
average
(field)¶Average over the values of the specified field.
Parameters: field – the field to average over; use dot-notation to refer to embedded document fields
Changed in version 0.5: - updated to map_reduce as db.eval doesn't work with sharding.
clone_into
(cls)¶Creates a copy of the current
BaseQuerySet
into another child class
count
(with_limit_and_skip=False)¶Count the selected elements in the query.
Parameters: with_limit_and_skip (optional) – take any limit() or
skip() that has been applied to this cursor into account when
getting the count
create
(**kwargs)¶Create new object. Returns the saved object instance.
New in version 0.4.
delete
(write_concern=None, _from_doc_delete=False, cascade_refs=None)¶Delete the documents matched by the query.
Returns the number of deleted documents.
distinct
(field)¶Return a list of distinct values for a given field.
Parameters: field – the field to select distinct values from
Note
This is a command and won’t take ordering or limit into account.
New in version 0.4.
Changed in version 0.5: - Fixed handling references
Changed in version 0.6: - Improved db_field reference handling
ensure_index
(**kwargs)¶Deprecated; use Document.ensure_index() instead.
exclude
(*fields)¶Opposite to .only(), exclude some document’s fields.
post = BlogPost.objects(...).exclude("comments")
Note
exclude() is chainable and will perform a union, so the following will exclude both title and author.name:
post = BlogPost.objects.exclude("title").exclude("author.name")
all_fields()
will reset any
field filters.
Parameters: fields – fields to exclude
New in version 0.5.
exec_js
(code, *fields, **options)¶Execute a Javascript function on the server. A list of fields may be
provided, which will be translated to their correct names and supplied
as the arguments to the function. A few extra variables are added to
the function’s scope: collection
, which is the name of the
collection in use; query
, which is an object representing the
current query; and options
, which is an object containing any
options specified as keyword arguments.
As fields in MongoEngine may use different names in the database (set
using the db_field
keyword argument to a Field
constructor), a mechanism exists for replacing MongoEngine field names
with the database field names in Javascript code. When accessing a
field, use square-bracket notation, and prefix the MongoEngine field
name with a tilde (~).
explain
(format=False)¶Return an explain plan record for the
QuerySet
‘s cursor.
Parameters: format – format the plan before returning it
fields
(_only_called=False, **kwargs)¶Manipulate how you load this document’s fields. Used by .only() and .exclude() to manipulate which fields to retrieve. Fields also allows for a greater level of control for example:
Retrieving a Subrange of Array Elements:
You can use the $slice operator to retrieve a subrange of elements in an array. For example to get the first 5 comments:
post = BlogPost.objects(...).fields(slice__comments=5)
Parameters: kwargs – A dictionary identifying what to include
New in version 0.5.
filter
(*q_objs, **query)¶An alias of __call__()
first
()¶Retrieve the first object matching the query.
from_json
(json_data)¶Converts json data to unsaved objects
get
(*q_objs, **query)¶Retrieve the matching object, raising
MultipleObjectsReturned
or DocumentName.MultipleObjectsReturned if multiple results are found,
and DoesNotExist
or DocumentName.DoesNotExist if no results are found.
New in version 0.3.
hint
(index=None)¶Added ‘hint’ support, telling Mongo the proper index to use for the query.
Judicious use of hints can greatly improve query performance. When doing a query on multiple fields (at least one of which is indexed) pass the indexed field as a hint to the query.
Hinting will not do anything if the corresponding index does not exist. The last hint applied to this cursor takes precedence over all others.
New in version 0.5.
in_bulk
(object_ids)¶Retrieve a set of documents by their ids.
Parameters: object_ids – a list or tuple of ObjectId s
Return type: dict with ObjectId s as keys and collection-specific Document subclasses as values
New in version 0.3.
insert
(doc_or_docs, load_bulk=True, write_concern=None)¶Bulk insert documents.
By default returns document instances, set load_bulk
to False to
return just ObjectIds
New in version 0.5.
item_frequencies
(field, normalize=False, map_reduce=True)¶Returns a dictionary of all items present in a field across the whole queried set of documents, and their corresponding frequency. This is useful for generating tag clouds, or searching documents.
Note
Can only perform direct, simple mappings; it cannot map across
ReferenceField
or
GenericReferenceField
. For more complex
counting, a manual map/reduce call is required.
If the field is a ListField
, the items within
each list will be counted individually.
Changed in version 0.5: defaults to map_reduce and can handle embedded document lookups
limit
(n)¶Limit the number of returned documents to n. This may also be
achieved using array-slicing syntax (e.g. User.objects[:5]
).
Parameters: n – the maximum number of objects to return
map_reduce
(map_f, reduce_f, output, finalize_f=None, limit=None, scope=None)¶Perform a map/reduce query using the current query spec
and ordering. While map_reduce
respects QuerySet
chaining,
it must be the last call made, as it does not return a malleable
QuerySet
.
See the test_map_reduce()
and test_map_advanced()
tests in tests.queryset.QuerySetTest
for usage examples.
Returns an iterator yielding
MapReduceDocument
.
Note
Map/Reduce changed in server version >= 1.7.4. The PyMongo
map_reduce()
helper requires
PyMongo version >= 1.11.
Changed in version 0.5: - removed keep_temp
keyword argument, which was only relevant
for MongoDB server versions older than 1.7.4
New in version 0.3.
max_time_ms
(ms)¶Wait ms milliseconds before killing the query on the server
Parameters: ms – the number of milliseconds before killing the query on the server
modify
(upsert=False, full_response=False, remove=False, new=False, **update)¶Update and return the updated document.
Returns either the document before or after modification based on new
parameter. If no documents match the query and upsert is false,
returns None
. If upserting and new is false, returns None
.
If the full_response parameter is True
, the return value will be
the entire response object from the server, including the ‘ok’ and
‘lastErrorObject’ fields, rather than just the modified document.
This is useful mainly because the ‘lastErrorObject’ document holds
information about the command’s execution.
New in version 0.9.
no_cache
()¶Convert to a non-caching queryset
New in version 0.8.3: Convert to non caching queryset
no_dereference
()¶Turn off any dereferencing for the results of this queryset.
no_sub_classes
()¶Only return instances of this document and not any inherited documents
none
()¶Helper that returns an empty queryset, matching no documents
only
(*fields)¶Load only a subset of this document’s fields.
post = BlogPost.objects(...).only("title", "author.name")
Note
only() is chainable and will perform a union, so the following will fetch both title and author.name:
post = BlogPost.objects.only("title").only("author.name")
all_fields()
will reset any
field filters.
Parameters: fields – fields to include
New in version 0.3.
Changed in version 0.5: - Added subfield support
order_by
(*keys)¶Order the QuerySet
by the keys. The
order may be specified by prepending each of the keys by a + or a -.
Ascending order is assumed.
Parameters: keys – fields to order the query results by; keys may be prefixed with + or - to determine the ordering direction
read_preference
(read_preference)¶Change the read_preference when querying.
Parameters: read_preference – override ReplicaSetConnection-level preference.
rewind
()¶Rewind the cursor to its unevaluated state.
New in version 0.3.
scalar
(*fields)¶Instead of returning Document instances, return either a specific value or a tuple of values in order.
Can be used along with
no_dereference()
to turn off
dereferencing.
Note
This affects all results and can be unset by calling
scalar
without arguments. Calls only
automatically.
Parameters: fields – One or more fields to return instead of a Document.
search_text
(text, language=None)¶Start a text search using text indexes. Requires MongoDB server version 2.6+.
Parameters: language – The language that determines the list of stop words for the search and the rules for the stemmer and tokenizer. If not specified, the search uses the default language of the index. For supported languages, see Text Search Languages (http://docs.mongodb.org/manual/reference/text-search-languages/#text-search-languages).
select_related
(max_depth=1)¶Handles dereferencing of DBRef
objects or
ObjectId
s to a maximum depth in order to cut down
the number of queries to MongoDB.
New in version 0.5.
skip
(n)¶Skip n documents before returning the results. This may also be
achieved using array-slicing syntax (e.g. User.objects[5:]
).
Parameters: n – the number of objects to skip before returning results
slave_okay
(enabled)¶Enable or disable the slave_okay when querying.
Parameters: enabled – whether or not the slave_okay is enabled
Deprecated: ignored with PyMongo 3+.
snapshot
(enabled)¶Enable or disable snapshot mode when querying.
Parameters: enabled – whether or not snapshot mode is enabled
Changed in version 0.5: made chainable.
Deprecated: ignored with PyMongo 3+.
sum
(field)¶Sum over the values of the specified field.
Parameters: field – the field to sum over; use dot-notation to refer to embedded document fields
Changed in version 0.5: - updated to map_reduce as db.eval doesn't work with sharding.
timeout
(enabled)¶Enable or disable the default mongod timeout when querying.
Parameters: enabled – whether or not the timeout is used
Changed in version 0.5: made chainable.
to_json
(*args, **kwargs)¶Converts a queryset to JSON
update
(upsert=False, multi=True, write_concern=None, full_result=False, **update)¶Perform an atomic update on the fields matched by the query.
New in version 0.2.
update_one
(upsert=False, write_concern=None, **update)¶Perform an atomic update on the fields of the first document matched by the query.
New in version 0.2.
upsert_one
(write_concern=None, **update)¶Overwrite or add the first document matched by the query.
Returns the new or overwritten document.
New in version 0.10.2.
using
(alias)¶This method is for controlling which database the QuerySet will be evaluated against if you are using more than one database.
Parameters: alias – The database alias
New in version 0.9.
values_list
(*fields)¶An alias for scalar
where
(where_clause)¶Filter QuerySet
results with a $where
clause (a Javascript
expression). Performs automatic field name substitution like
mongoengine.queryset.Queryset.exec_js()
.
Note
When using this mode of query, the database will call your function, or evaluate your predicate clause, for each object in the collection.
New in version 0.5.
with_id
(object_id)¶Retrieve the object matching the id provided. Uses object_id only and raises InvalidQueryError if a filter has been applied. Returns None if no document exists with that id.
Parameters: object_id – the value for the id of the document to look up
Changed in version 0.6: Raises InvalidQueryError if filter has been set
mongoengine.queryset.
QuerySetNoCache
(document, collection)¶A non caching QuerySet
__call__
(q_obj=None, class_check=True, read_preference=None, **query)¶Filter the selected documents by calling the
QuerySet
with a query.
Parameters:
- q_obj – a Q object to be used in the query; if the QuerySet is filtered multiple times with different Q objects, only the last one will be used
- class_check – if set to False, bypass the class name check when querying the collection
- read_preference – if set, overrides the connection-level read_preference from ReplicaSetConnection
- query – Django-style query keyword arguments
cache
()¶Convert to a caching queryset
New in version 0.8.3: Convert to caching queryset
mongoengine.queryset.
queryset_manager
(func)¶Decorator that allows you to define custom QuerySet managers on
Document
classes. The manager must be a function that
accepts a Document
class as its first argument, and a
QuerySet
as its second argument. The function should return a
QuerySet
, probably
the same one that was passed in, but modified in some way.
mongoengine.base.fields.
BaseField
(db_field=None, name=None, required=False, default=None, unique=False, unique_with=None, primary_key=False, validation=None, choices=None, null=False, sparse=False, **kwargs)¶A base class for fields in a MongoDB document. Instances of this class may be added to subclasses of Document to define a document’s schema.
Changed in version 0.5: - added verbose and help text
mongoengine.fields.
StringField
(regex=None, max_length=None, min_length=None, **kwargs)¶A unicode string field.
mongoengine.fields.
URLField
(verify_exists=False, url_regex=None, schemes=None, **kwargs)¶A field that validates input as a URL.
New in version 0.3.
mongoengine.fields.
EmailField
(regex=None, max_length=None, min_length=None, **kwargs)¶A field that validates input as an e-mail address.
New in version 0.4.
mongoengine.fields.
IntField
(min_value=None, max_value=None, **kwargs)¶A 32-bit integer field.
mongoengine.fields.
LongField
(min_value=None, max_value=None, **kwargs)¶A 64-bit integer field.
mongoengine.fields.
FloatField
(min_value=None, max_value=None, **kwargs)¶A floating-point number field.
mongoengine.fields.
DecimalField
(min_value=None, max_value=None, force_string=False, precision=2, rounding='ROUND_HALF_UP', **kwargs)¶A fixed-point decimal number field.
Changed in version 0.8.
New in version 0.3.
mongoengine.fields.
BooleanField
(db_field=None, name=None, required=False, default=None, unique=False, unique_with=None, primary_key=False, validation=None, choices=None, null=False, sparse=False, **kwargs)¶A boolean field type.
New in version 0.1.2.
mongoengine.fields.
DateTimeField
(db_field=None, name=None, required=False, default=None, unique=False, unique_with=None, primary_key=False, validation=None, choices=None, null=False, sparse=False, **kwargs)¶A datetime field.
Uses the python-dateutil library if available, falling back to time.strptime otherwise. Note: python-dateutil's parser is fully featured, and when installed you can use it to convert varying types of date formats into valid python datetime objects.
Use ComplexDateTimeField
if you
need accurate microsecond support.
mongoengine.fields.
ComplexDateTimeField
(separator=', ', **kwargs)¶ComplexDateTimeField handles microseconds exactly instead of rounding like DateTimeField does.
Derives from a StringField so you can do gte and lte filtering by using lexicographical comparison when filtering / sorting strings.
The stored string has the following format:
YYYY,MM,DD,HH,MM,SS,NNNNNN
Where NNNNNN is the number of microseconds of the represented datetime. The , as the separator can be easily modified by passing the separator keyword when initializing the field.
New in version 0.5.
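The stored layout can be reproduced with plain datetime formatting; a sketch (this mirrors the documented format, not mongoengine's internal code):

```python
from datetime import datetime

def complex_repr(dt, separator=','):
    # YYYY,MM,DD,HH,MM,SS,NNNNNN with zero-padded fields, as described above
    parts = dt.strftime('%Y %m %d %H %M %S').split()
    parts.append('%06d' % dt.microsecond)
    return separator.join(parts)

complex_repr(datetime(2016, 1, 2, 3, 4, 5, 67))  # '2016,01,02,03,04,05,000067'
```

Because every field is zero-padded, lexicographical string comparison matches chronological order, which is what makes gte/lte filtering work.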
mongoengine.fields.
EmbeddedDocumentField
(document_type, **kwargs)¶An embedded document field - with a declared document_type.
Only valid values are subclasses of EmbeddedDocument
.
mongoengine.fields.GenericEmbeddedDocumentField(db_field=None, name=None, required=False, default=None, unique=False, unique_with=None, primary_key=False, validation=None, choices=None, null=False, sparse=False, **kwargs)
A generic embedded document field - allows any EmbeddedDocument to be stored.
Only valid values are subclasses of EmbeddedDocument.
Note
You can use the choices param to limit the acceptable EmbeddedDocument types.
mongoengine.fields.DynamicField(db_field=None, name=None, required=False, default=None, unique=False, unique_with=None, primary_key=False, validation=None, choices=None, null=False, sparse=False, **kwargs)
A truly dynamic field type, capable of handling different and varying types of data.
Used by DynamicDocument to handle dynamic data.
mongoengine.fields.ListField(field=None, **kwargs)
A list field that wraps a standard field, allowing multiple instances of the field to be used as a list in the database.
If using with ReferenceFields see: One to Many with ListFields
Note
Required means it cannot be empty - since the default for a ListField is [].
mongoengine.fields.EmbeddedDocumentListField(document_type, **kwargs)
A ListField designed specially to hold a list of embedded documents, providing additional query helpers.
Note
The only valid list values are subclasses of EmbeddedDocument.
New in version 0.9.
mongoengine.fields.SortedListField(field, **kwargs)
A ListField that sorts the contents of its list before writing to the database, in order to ensure that a sorted list is always retrieved.
Warning
There is a potential race condition when handling lists. If you set / save the whole list then other processes trying to save the whole list as well could overwrite changes. The safest way to append to a list is to perform a push operation.
New in version 0.4.
Changed in version 0.6: - added reverse keyword
mongoengine.fields.DictField(basecls=None, field=None, *args, **kwargs)
A dictionary field that wraps a standard Python dictionary. This is similar to an embedded document, but the structure is not defined.
Note
Required means it cannot be empty - since the default for a DictField is {}.
New in version 0.3.
Changed in version 0.5: - Can now handle complex / varying types of data
mongoengine.fields.MapField(field=None, *args, **kwargs)
A field that maps a name to a specified field type. Similar to a DictField, except the 'value' of each item must match the specified field type.
New in version 0.5.
mongoengine.fields.ReferenceField(document_type, dbref=False, reverse_delete_rule=0, **kwargs)
A reference to a document that will be automatically dereferenced on access (lazily).
Use the reverse_delete_rule to specify what should happen if the referenced document is deleted. EmbeddedDocuments, DictFields and MapFields do not support reverse_delete_rule; an InvalidDocumentError will be raised if you try to set it on one of these Document / Field types.
The options are:
- DO_NOTHING (0) - don’t do anything (default).
- NULLIFY (1) - Updates the reference to null.
- CASCADE (2) - Deletes the documents associated with the reference.
- DENY (3) - Prevent the deletion of the reference object.
- PULL (4) - Pull the reference from a ListField of references.
Alternative syntax for registering delete rules (useful when implementing bi-directional delete rules):
class Bar(Document):
    content = StringField()
    foo = ReferenceField('Foo')

Foo.register_delete_rule(Bar, 'foo', NULLIFY)
Note
reverse_delete_rule does not trigger pre / post delete signals.
Changed in version 0.5: added reverse_delete_rule
Initialises the Reference Field.
Note
A reference to an abstract document type is always stored as a DBRef, regardless of the value of dbref.
mongoengine.fields.GenericReferenceField(*args, **kwargs)
A reference to any Document subclass that will be automatically dereferenced on access (lazily).
New in version 0.3.
mongoengine.fields.CachedReferenceField(document_type, fields=[], auto_sync=True, **kwargs)
A ReferenceField that caches the named fields on the referencing document, enabling pseudo-joins.
New in version 0.9.
Initialises the Cached Reference Field.
mongoengine.fields.BinaryField(max_bytes=None, **kwargs)
A binary data field.
mongoengine.fields.FileField(db_alias='default', collection_name='fs', **kwargs)
A GridFS storage field.
New in version 0.4.
Changed in version 0.5: added optional size param for read
Changed in version 0.6: added db_alias for multidb support
mongoengine.fields.ImageField(size=None, thumbnail_size=None, collection_name='images', **kwargs)
An image file storage field.
New in version 0.6.
mongoengine.fields.SequenceField(collection_name=None, db_alias=None, sequence_name=None, value_decorator=None, *args, **kwargs)
Note
Although traditional databases often use increasing sequence numbers for primary keys, in MongoDB the preferred approach is to use Object IDs instead. The concept is that in a very large cluster of machines it is easier to create an Object ID than to maintain global, uniformly increasing sequence numbers.
Use any callable as value_decorator to transform calculated counter into any value suitable for your needs, e.g. string or hexadecimal representation of the default integer counter value.
Note
In case the counter is defined in the abstract document, it will be common to all inherited documents and the default sequence name will be the class name of the abstract document.
New in version 0.5.
Changed in version 0.8: added value_decorator
mongoengine.fields.ObjectIdField(db_field=None, name=None, required=False, default=None, unique=False, unique_with=None, primary_key=False, validation=None, choices=None, null=False, sparse=False, **kwargs)
A field wrapper around MongoDB's ObjectIds.
mongoengine.fields.UUIDField(binary=True, **kwargs)
A UUID field.
Store UUID data in the database.
Parameters: binary – if False, store as a string.
New in version 0.6.
Changed in version 0.6.19.
Changed in version 0.8.0.
mongoengine.fields.GeoPointField(db_field=None, name=None, required=False, default=None, unique=False, unique_with=None, primary_key=False, validation=None, choices=None, null=False, sparse=False, **kwargs)
A list storing a longitude and latitude coordinate.
Note
This represents a generic point in a 2D plane, and is a legacy way of representing a geo point. It admits 2d indexes but not "2dsphere" indexes in MongoDB > 2.4, which are more natural for modeling geospatial points. See Geospatial indexes.
New in version 0.4.
mongoengine.fields.PointField(auto_index=True, *args, **kwargs)
A GeoJSON field storing a longitude and latitude coordinate.
The data is represented as:
{"type": "Point",
 "coordinates": [x, y]}
You can either pass a dict with the full information or a list to set the value.
Requires mongodb >= 2.4
New in version 0.8.
Parameters: auto_index (bool) – Automatically create a "2dsphere" index. Defaults to True.
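The two accepted value shapes can be shown with plain Python data (coordinates are illustrative; note that GeoJSON order is [longitude, latitude]):

```python
# Full GeoJSON dict form
full = {"type": "Point", "coordinates": [-87.6298, 41.8781]}

# Shorthand list form - both set the same point on the field
short = [-87.6298, 41.8781]

# Either shape may be assigned to a PointField; the stored document
# uses the GeoJSON dict form shown above
print(full["coordinates"] == short)  # True
```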
mongoengine.fields.LineStringField(auto_index=True, *args, **kwargs)
A GeoJSON field storing a line of longitude and latitude coordinates.
The data is represented as:
{"type": "LineString",
 "coordinates": [[x1, y1], [x2, y2] ... [xn, yn]]}
You can either pass a dict with the full information or a list of points.
Requires mongodb >= 2.4
New in version 0.8.
Parameters: auto_index (bool) – Automatically create a "2dsphere" index. Defaults to True.
mongoengine.fields.PolygonField(auto_index=True, *args, **kwargs)
A GeoJSON field storing a polygon of longitude and latitude coordinates.
The data is represented as:
{"type": "Polygon",
 "coordinates": [[[x1, y1], [x2, y2] ... [xn, yn]],
                 [[x1, y1], [x2, y2] ... [xn, yn]]]}
You can either pass a dict with the full information or a list of LineStrings, the first LineString being the outside and the rest being holes.
Requires mongodb >= 2.4
New in version 0.8.
Parameters: auto_index (bool) – Automatically create a "2dsphere" index. Defaults to True.
mongoengine.fields.MultiPointField(auto_index=True, *args, **kwargs)
A GeoJSON field storing a list of Points.
The data is represented as:
{"type": "MultiPoint",
 "coordinates": [[x1, y1], [x2, y2]]}
You can either pass a dict with the full information or a list to set the value.
Requires mongodb >= 2.6
New in version 0.9.
Parameters: auto_index (bool) – Automatically create a "2dsphere" index. Defaults to True.
mongoengine.fields.MultiLineStringField(auto_index=True, *args, **kwargs)
A GeoJSON field storing a list of LineStrings.
The data is represented as:
{"type": "MultiLineString",
 "coordinates": [[[x1, y1], [x2, y2] ... [xn, yn]],
                 [[x1, y1], [x2, y2] ... [xn, yn]]]}
You can either pass a dict with the full information or a list of points.
Requires mongodb >= 2.6
New in version 0.9.
Parameters: auto_index (bool) – Automatically create a "2dsphere" index. Defaults to True.
mongoengine.fields.MultiPolygonField(auto_index=True, *args, **kwargs)
A GeoJSON field storing a list of Polygons.
The data is represented as:
{"type": "MultiPolygon",
 "coordinates": [[
     [[x1, y1], [x2, y2] ... [xn, yn]],
     [[x1, y1], [x2, y2] ... [xn, yn]]
 ], [
     [[x1, y1], [x2, y2] ... [xn, yn]],
     [[x1, y1], [x2, y2] ... [xn, yn]]
 ]]}
You can either pass a dict with the full information or a list of Polygons.
Requires mongodb >= 2.6
New in version 0.9.
Parameters: auto_index (bool) – Automatically create a "2dsphere" index. Defaults to True.
mongoengine.fields.GridFSError
mongoengine.fields.GridFSProxy(grid_id=None, key=None, instance=None, db_alias='default', collection_name='fs')
Proxy object to handle writing and reading of files to and from GridFS.
New in version 0.4.
Changed in version 0.5: added optional size param to read
Changed in version 0.6: added collection name param
mongoengine.fields.ImageGridFsProxy(grid_id=None, key=None, instance=None, db_alias='default', collection_name='fs')
Proxy for ImageField.
New in version 0.6.
mongoengine.fields.ImproperlyConfigured
New in version 0.9.
Additional queries for Embedded Documents are available when using the EmbeddedDocumentListField to store a list of embedded documents.
A list of embedded documents is returned as a special list with the following methods:
mongoengine.base.datastructures.EmbeddedDocumentList(list_items, instance, name)
count()
The number of embedded documents in the list.
Returns: The length of the list, equivalent to the result of len().
create(**values)
Creates a new embedded document and saves it to the database.
Note
The embedded document changes are not automatically saved to the database after calling this method.
Parameters: values – A dictionary of values for the embedded document.
Returns: The new embedded document instance.
delete()
Deletes the embedded documents from the database.
Note
The embedded document changes are not automatically saved to the database after calling this method.
Returns: The number of entries deleted.
exclude(**kwargs)
Filters the list by excluding embedded documents with the given keyword arguments.
Parameters: kwargs – The keyword arguments corresponding to the fields to exclude on. Multiple arguments are treated as if they are ANDed together.
Returns: A new EmbeddedDocumentList containing the non-matching embedded documents.
Raises AttributeError if a given keyword is not a valid field for the embedded document class.
filter(**kwargs)
Filters the list by only including embedded documents with the given keyword arguments.
Parameters: kwargs – The keyword arguments corresponding to the fields to filter on. Multiple arguments are treated as if they are ANDed together.
Returns: A new EmbeddedDocumentList containing the matching embedded documents.
Raises AttributeError if a given keyword is not a valid field for the embedded document class.
first()
Returns the first embedded document in the list, or None if empty.
get(**kwargs)
Retrieves an embedded document determined by the given keyword arguments.
Parameters: kwargs – The keyword arguments corresponding to the fields to search on. Multiple arguments are treated as if they are ANDed together.
Returns: The embedded document matched by the given keyword arguments.
Raises DoesNotExist if the arguments used to query an embedded document return no results, and MultipleObjectsReturned if more than one result is returned.
save(*args, **kwargs)
Saves the ancestor document.
update(**update)
Updates the embedded documents with the given update values.
Note
The embedded document changes are not automatically saved to the database after calling this method.
Parameters: update – A dictionary of update values to apply to each embedded document.
Returns: The number of entries updated.
mongoengine.common._import_class(cls_name)
Cache mechanism for imports.
Due to complications with circular imports, mongoengine needs to do lots of inline imports in functions. This is inefficient, as classes are imported repeatedly throughout the mongoengine code, and is compounded by some recursive functions requiring inline imports.
mongoengine.common provides a single point to import all these classes. Circular imports aren't an issue, as it dynamically imports the class when first needed. Subsequent calls to _import_class() can then directly retrieve the class from mongoengine.common._class_registry_cache.
- Fixed EmbeddedDocuments with id also storing _id (#402)
- Added get_proxy_object helper to filefields (#391)
- Added QuerySetNoCache and QuerySet.no_cache() for lower memory consumption (#365)
- Fixed sum and average mapreduce dot notation support (#375, #376, #393)
- Fixed as_pymongo to return the id (#386)
- Document.select_related() now respects db_alias (#377)
- Reload uses shard_key if applicable (#384)
- Dynamic fields are ordered based on creation and stored in _fields_ordered (#396)
- Potential breaking change: http://docs.mongoengine.org/en/latest/upgrade.html#to-0-8-3
- Fixed pickling dynamic documents _dynamic_fields (#387)
- Fixed ListField setslice and delslice dirty tracking (#390)
- Added Django 1.5 PY3 support (#392)
- Added match ($elemMatch) support for EmbeddedDocuments (#379)
- Fixed weakref being valid after reload (#374)
- Fixed queryset.get() respecting no_dereference (#373)
- Added full_result kwarg to update (#380)
- DictField
- DictField entries containing strings to use matching operators
- MapField, similar to DictField
- NotRegistered exception if dereferencing a Document not in the registry
- save, update, update_one and get_or_create
- Document __hash__, __ne__ for pickling
- FileField optional size arg for read method
- FileField seek and tell methods for reading files
- QuerySet.clone to support copying querysets
- QuerySet.all_fields resets previous .only() and .exclude()
- QuerySet.exclude
- QuerySet.only subfield support
- BaseField allowing fields to be sorted in the way the user has specified them
- GridFSStorage Django storage backend
- FileField for GridFS support
- SortedListField
- EmailField
- GeoPointField
- exact and iexact match operators to QuerySet
- get_document_or_404 and get_list_or_404 Django shortcuts
- not query operator
- pop and add_to_set
- __raw__ query parameter
- DictField
- QuerySet.distinct, QuerySet.create, QuerySet.snapshot, QuerySet.timeout and QuerySet.all
- connect() now work
- min_length for StringField
- contains, startswith and endswith query operators (and case-insensitive versions that are prefixed with 'i')
- name parameter, replaced with db_field
- QuerySet.only for only retrieving specific fields
- QuerySet.in_bulk() for bulk querying using ids
- QuerySets now have a rewind() method, which is called automatically when the iterator is exhausted, allowing QuerySets to be reused
- DictField
- URLField
- DecimalField
- BinaryField
- GenericReferenceField
- get() and get_or_create() methods to QuerySet
- ReferenceFields may now reference the document they are defined on (recursive references) and documents that have not yet been defined
- Document objects may now be compared for equality (equal if _ids are equal and documents are of same type)
- QuerySet update methods now have an upsert parameter
- Q objects now support regex querying
- ReferenceFields may now be queried using their _id
- EmbeddedDocuments couldn't be non-polymorphic
- queryset_manager functions now accept two arguments – the document class as the first and the queryset as the second
- QuerySet.exec_js ignored Q objects
- ListFields
- Document.filter() added as an alias to Document.__call__()
- validate() may now be used on EmbeddedDocuments
- force_insert to Document.save()
- ListField and EmbeddedDocumentField
- _id in MongoDB
- Q class for building advanced queries
- QuerySet methods for atomic updates to documents
- unique=True to enforce uniqueness across a collection
- Document.meta support for indexes, which are ensured just before querying takes place
- BooleanField
- Document.reload() method
The 0.8.7 package on pypi was corrupted. If upgrading from 0.8.7 to 0.9.0 please follow:
pip uninstall pymongo
pip uninstall mongoengine
pip install pymongo==2.8
pip install mongoengine
Calling reload on deleted / nonexistent documents now raises a DoesNotExist exception.
Minor change that may impact users:
DynamicDocument fields are now stored in creation order after any declared fields. Previously they were stored alphabetically.
There have been numerous backwards breaking changes in 0.8. The reasons for these are to ensure that MongoEngine has sane defaults going forward and that it performs the best it can out of the box. Where possible there have been FutureWarnings to help get you ready for the change, but that hasn’t been possible for the whole of the release.
Warning
Breaking changes - test upgrading on a test system before putting live. There may be multiple manual steps in migrating, and these are best honed on a staging / test system.
MongoEngine requires python 2.6 (or above) and pymongo 2.5 (or above)
The inheritance model has changed: we no longer need to store an array of types with the model; we can just use the class name in _cls.
This means that you will have to update your indexes for each of your
inherited classes like so:
# 1. Declaration of the class
class Animal(Document):
name = StringField()
meta = {
'allow_inheritance': True,
'indexes': ['name']
}
# 2. Remove _types
collection = Animal._get_collection()
collection.update({}, {"$unset": {"_types": 1}}, multi=True)
# 3. Confirm extra data is removed
count = collection.find({'_types': {"$exists": True}}).count()
assert count == 0
# 4. Remove indexes
info = collection.index_information()
indexes_to_drop = [key for key, value in info.iteritems()
if '_types' in dict(value['key'])]
for index in indexes_to_drop:
collection.drop_index(index)
# 5. Recreate indexes
Animal.ensure_indexes()
The default for inheritance has changed - it is now off by default and
_cls
will not be stored automatically with the class. So if you extend
your Document
or EmbeddedDocuments
you will need to declare allow_inheritance
in the meta data like so:
class Animal(Document):
name = StringField()
meta = {'allow_inheritance': True}
Previously, if you had data in the database that wasn’t defined in the Document
definition, it would set it as an attribute on the document. This is no longer
the case and the data is set only in the document._data
dictionary:
>>> from mongoengine import *
>>> class Animal(Document):
... name = StringField()
...
>>> cat = Animal(name="kit", size="small")
# 0.7
>>> cat.size
u'small'
# 0.8
>>> cat.size
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'Animal' object has no attribute 'size'
The Document class has introduced a reserved function clean(), which will be called before saving the document. If your document class happens to have a method with the same name, please try to rename it.
def clean(self):
    pass
ReferenceFields now store ObjectIds by default - this is more efficient than DBRefs as we already know what Document types they reference:
# Old code
class Animal(Document):
name = ReferenceField('self')
# New code to keep dbrefs
class Animal(Document):
name = ReferenceField('self', dbref=True)
To migrate all the references you need to touch each object and mark it as dirty eg:
# Doc definition
class Person(Document):
name = StringField()
parent = ReferenceField('self')
friends = ListField(ReferenceField('self'))
# Mark all ReferenceFields as dirty and save
for p in Person.objects:
p._mark_as_changed('parent')
p._mark_as_changed('friends')
p.save()
An example test migration for ReferenceFields is available on github.
Note
Internally mongoengine handles ReferenceFields the same, so they are converted to DBRef on loading and ObjectIds or DBRefs depending on settings on storage.
UUIDFields now default to storing binary values:
# Old code
class Animal(Document):
uuid = UUIDField()
# New code
class Animal(Document):
uuid = UUIDField(binary=False)
To migrate all the uuids you need to touch each object and mark it as dirty eg:
# Doc definition
class Animal(Document):
uuid = UUIDField()
# Mark all UUIDFields as dirty and save
for a in Animal.objects:
a._mark_as_changed('uuid')
a.save()
An example test migration for UUIDFields is available on github.
DecimalFields now store floats - previously they stored strings, which made it impossible to do comparisons correctly when querying:
# Old code
class Person(Document):
balance = DecimalField()
# New code
class Person(Document):
balance = DecimalField(force_string=True)
To migrate all the DecimalFields you need to touch each object and mark it as dirty eg:
# Doc definition
class Person(Document):
balance = DecimalField()
# Mark all DecimalField's as dirty and save
for p in Person.objects:
p._mark_as_changed('balance')
p.save()
Note
DecimalFields have also been improved with the addition of precision
and rounding. See DecimalField
for more information.
An example test migration for DecimalFields is available on github.
To improve performance document saves will no longer automatically cascade. Any changes to a Document’s references will either have to be saved manually or you will have to explicitly tell it to cascade on save:
# At the class level:
class Person(Document):
meta = {'cascade': True}
# Or on save:
my_document.save(cascade=True)
Document and Embedded Documents are now serialized based on declared field order.
Previously, the data was passed to mongodb as a dictionary and which meant that
order wasn’t guaranteed - so things like $addToSet
operations on
EmbeddedDocument
could potentially fail in unexpected
ways.
If this impacts you, you may want to rewrite the objects using the doc._mark_as_changed('field') pattern described above. If you are using a compound primary key then you will need to ensure the order is fixed and matches your EmbeddedDocument field order.
Querysets now return clones and should no longer be considered editable in place. This brings us in line with how Django’s querysets work and removes a long running gotcha. If you edit your querysets inplace you will have to update your code like so:
# Old code:
mammals = Animal.objects(type="mammal")
mammals.filter(order="Carnivora") # Returns a cloned queryset that isn't assigned to anything - so this will break in 0.8
[m for m in mammals] # This will return all mammals in 0.8 as the 2nd filter returned a new queryset
# Update example a) assign queryset after a change:
mammals = Animal.objects(type="mammal")
carnivores = mammals.filter(order="Carnivora") # Reassign the new queryset so filter can be applied
[m for m in carnivores] # This will return all carnivores
# Update example b) chain the queryset:
mammals = Animal.objects(type="mammal").filter(order="Carnivora") # The final queryset is assigned to mammals
[m for m in mammals] # This will return all carnivores
If you ever did len(queryset) it previously did a count() under the covers, this caused some unusual issues. As len(queryset) is most often used by list(queryset) we now cache the queryset results and use that for the length.
This isn’t as performant as a count() and if you aren’t iterating the queryset you should upgrade to use count:
# Old code
len(Animal.objects(type="mammal"))
# New code
Animal.objects(type="mammal").count()
The behaviour of .only() was highly ambiguous, now it works in mirror fashion to .exclude(). Chaining .only() calls will increase the fields required:
# Old code
Animal.objects().only(['type', 'name']).only('name', 'order') # Would have returned just `name`
# New code
Animal.objects().only('name')
# Note:
Animal.objects().only(['name']).only('order') # Now returns `name` *and* `order`
PyMongo 2.4 came with a new connection client (MongoClient) and began the deprecation of the old Connection. MongoEngine now uses MongoClient for connections. By default operations were safe, but if you turned them off or used the connection directly this will impact your queries.
safe has been deprecated in the new MongoClient connection. Please use write_concern instead. As safe always defaulted to True, normally no code change is required. To disable confirmation of the write just pass {"w": 0} eg:
# Old
Animal(name="Dinosaur").save(safe=False)
# new code:
Animal(name="Dinosaur").save(write_concern={"w": 0})
write_options has been replaced with write_concern to bring it inline with pymongo. To upgrade, simply rename any instances where you used the write_options keyword to write_concern like so:
# Old code:
Animal(name="Dinosaur").save(write_options={"w": 2})
# new code:
Animal(name="Dinosaur").save(write_concern={"w": 2})
Index methods are no longer tied to querysets but rather to the document class. Although QuerySet._ensure_indexes and QuerySet.ensure_index still exist, they should be replaced with ensure_indexes() / ensure_index().
SequenceField now inherits from BaseField to allow flexible storage of the calculated value. As such, MIN and MAX settings are no longer handled.
Saves will raise a FutureWarning if they cascade and cascade hasn't been set to True. This is because in 0.8 it will default to False. If you require cascading saves then either set it in the meta or pass it via save, eg:
# At the class level:
class Person(Document):
meta = {'cascade': True}
# Or in code:
my_document.save(cascade=True)
Note
Remember: cascading saves do not cascade through lists.
ReferenceFields now can store references as ObjectId strings instead of DBRefs. This will become the default in 0.8 and if dbref is not set a FutureWarning will be raised.
To explicitly continue to use DBRefs change the dbref flag to True
class Person(Document):
groups = ListField(ReferenceField(Group, dbref=True))
To migrate to using strings instead of DBRefs you will have to manually migrate
# Step 1 - Migrate the model definition
class Group(Document):
author = ReferenceField(User, dbref=False)
members = ListField(ReferenceField(User, dbref=False))
# Step 2 - Migrate the data
for g in Group.objects():
g.author = g.author
g.members = g.members
g.save()
In the 0.6 series we added support for null / zero / false values in item_frequencies. A side effect was to return keys as the type they are stored in, rather than as string representations. Your code may need to be updated to handle native types rather than string keys for the results of item frequency queries.
Binary fields have been updated so that they are native binary types. If you previously were doing str comparisons with binary field values you will have to update and wrap the value in a str.
Embedded Documents - if you had a pk field you will have to rename it from _id to pk as pk is no longer a property of Embedded Documents.
Reverse Delete Rules in Embedded Documents, MapFields and DictFields now throw an InvalidDocument error as they aren’t currently supported.
Document._get_subclasses - Is no longer used and the class method has been removed.
Document.objects.with_id - now raises an InvalidQueryError if used with a filter.
FutureWarning - A future warning has been added to all inherited classes that
don’t define allow_inheritance
in their meta.
You may need to update PyMongo to 2.0 for use with sharding.
There have been the following backwards incompatibilities from 0.4 to 0.5. The main areas of change are: choices in fields, map_reduce and collection names.
Are now expected to be an iterable of tuples, with the first element in each tuple being the actual value to be stored. The second element is the human-readable name for the option.
map_reduce now requires pymongo 1.11+. The pymongo merge_output and reduce_output parameters have been deprecated.
More methods now use map_reduce, as db.eval is not supported for sharding; as such the following have been changed:
Previously the default collection name was just the lowercased class name; it's now lowercase with underscores, which is much more pythonic and readable. Previously:
class MyAceDocument(Document):
pass
MyAceDocument._meta['collection'] == "myacedocument"
In 0.5 this will change to
class MyAceDocument(Document):
pass
MyAceDocument._get_collection_name() == "my_ace_document"
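The new convention can be illustrated with a small camel-case-to-underscore helper. This is an illustration of the naming rule only, not MongoEngine's internal implementation:

```python
import re

def to_new_style(class_name):
    """Insert underscores at lower/upper boundaries, then lowercase."""
    return re.sub(r'(?<=[a-z0-9])(?=[A-Z])', '_', class_name).lower()

print(to_new_style('MyAceDocument'))  # my_ace_document
print(to_new_style('BlogPost'))       # blog_post
```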
To upgrade, use a Mixin class to set meta like so:
class BaseMixin(object):
meta = {
'collection': lambda c: c.__name__.lower()
}
class MyAceDocument(Document, BaseMixin):
pass
MyAceDocument._get_collection_name() == "myacedocument"
Alternatively, you can rename your collections, eg:
from mongoengine.connection import _get_db
from mongoengine.base import _document_registry
def rename_collections():
db = _get_db()
failure = False
collection_names = [d._get_collection_name()
for d in _document_registry.values()]
for new_style_name in collection_names:
if not new_style_name: # embedded documents don't have collections
continue
old_style_name = new_style_name.replace('_', '')
if old_style_name == new_style_name:
continue # Nothing to do
existing = db.collection_names()
if old_style_name in existing:
if new_style_name in existing:
failure = True
print "FAILED to rename: %s to %s (already exists)" % (
old_style_name, new_style_name)
else:
db[old_style_name].rename(new_style_name)
print "Renamed: %s to %s" % (old_style_name,
new_style_name)
if failure:
print "Upgrading collection names failed"
else:
print "Upgraded collection names"
It's been reported that indexes may need to be recreated in the newer index format. To do this, drop the existing indexes and call ensure_indexes on each model.
Note
Django support has been split from the main MongoEngine repository. The legacy Django extension may be found bundled with the 0.9 release of MongoEngine.
The MongoEngine team is looking for help contributing and maintaining a new Django extension for MongoEngine! If you have Django experience and would like to help contribute to the project, please get in touch on the mailing list or by simply contributing on GitHub.