Aymeric Augustin
2011-09-03 15:40:52 UTC
Hello,
The GSoC proposal "Multiple timezone support for datetime representation" wasn't picked up in 2010 or 2011. Although I'm not a student and the summer is over, I'd like to tackle this problem, and I would appreciate it very much if a core developer agreed to mentor me during this work, GSoC-style.
Here is my proposal, following the GSoC guidelines. I apologize for the wall of text; this has been discussed many times in the past 4 years and I've tried to address as many concerns and objections as possible.
Definition of success
---------------------
The goal is to resolve ticket #2626 in Django 1.4 or 1.5 (depending on when 1.4 is released).
Design specification
--------------------
Some background on timezones in Django and Python
.................................................
Currently, Django stores datetime objects in local time in the database, local time being defined by the TIME_ZONE setting. It retrieves them as naive datetime objects. As a consequence, developers work with naive datetime objects in local time.
This approach sort of works when all the users are in the same timezone and don't care about data loss (inconsistencies) when DST kicks in or out. Unfortunately, these assumptions aren't true for many Django projects: for instance, one may want to log sessions (login/logout) for security purposes: that's a 24/7 flow of important data. Read tickets #2626 and #10587 for more details.
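To make the data-loss claim concrete, here is a minimal sketch using only the standard library. The two fixed offsets stand in for Europe/Paris around the end of DST on 2011-10-30; they are an illustration chosen here, not something Django or pytz defines.

```python
from datetime import datetime, timedelta, timezone

# Fixed-offset stand-ins for Europe/Paris (assumption for illustration):
# UTC+2 until DST ended on 2011-10-30 at 01:00 UTC, UTC+1 afterwards.
CEST = timezone(timedelta(hours=2), "CEST")
CET = timezone(timedelta(hours=1), "CET")

# Two distinct instants, one hour apart, e.g. a login and a logout.
login = datetime(2011, 10, 30, 0, 30, tzinfo=timezone.utc)
logout = datetime(2011, 10, 30, 1, 30, tzinfo=timezone.utc)

# What gets stored with the current approach: naive wall-clock local time.
stored_login = login.astimezone(CEST).replace(tzinfo=None)
stored_logout = logout.astimezone(CET).replace(tzinfo=None)

print(stored_login)                  # 2011-10-30 02:30:00
print(stored_logout)                 # 2011-10-30 02:30:00
print(logout - login)                # 1:00:00
print(stored_logout - stored_login)  # 0:00:00
```

The two stored values are indistinguishable, so the one-hour session appears to have lasted zero seconds.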
Python's standard library provides limited support for timezones, but this gap is filled by pytz <http://pytz.sourceforge.net/>. If you aren't familiar with the topic, I strongly recommend reading this page before my proposal. It explains the problems of working in local time and the limitations of Python's APIs. It has a lot of examples, too.
Django should use timezone-aware UTC datetimes internally
.........................................................
Example: datetime.datetime(2011, 9, 23, 8, 34, 12, tzinfo=pytz.utc)
In my opinion, the problem of local time is strikingly similar to the problem of character encodings. Django uses only unicode internally and converts at the borders (HTTP requests/responses and database). I propose a similar solution: Django should always use UTC internally, and conversion should happen at the borders, i.e. when rendering the templates and processing POST data (in form fields/widgets). I'll discuss the database in the next section.
Quoting pytz' docs: "The preferred way of dealing with times is to always work in UTC, converting to localtime only when generating output to be read by humans." I think we can trust pytz' developers on this topic.
Note that a timezone-aware UTC datetime is different from a naive datetime. If we were using naive datetimes, and assuming we're using pytz, a developer could write:
mytimezone.localize(datetime_django_gave_me)
which is incorrect, because it will interpret the naive datetime as local time in "mytimezone". With timezone-aware UTC datetimes, this kind of error can't happen, and the equivalent code is:
datetime_django_gave_me.astimezone(mytimezone)
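The difference can be sketched with the standard library alone. This is an illustration under assumptions: a fixed UTC+2 offset stands in for a pytz timezone, and `replace(tzinfo=...)` stands in for pytz's `localize()`, since both attach a timezone to a naive value without shifting the clock reading.

```python
from datetime import datetime, timedelta, timezone

# Fixed-offset stand-in for a pytz timezone (assumption: UTC+2).
mytimezone = timezone(timedelta(hours=2), "CEST")

# An aware UTC datetime, as proposed, and the same value with tzinfo stripped.
aware_utc = datetime(2011, 9, 23, 8, 34, 12, tzinfo=timezone.utc)
naive_utc = aware_utc.replace(tzinfo=None)

# Mistake: the naive UTC reading is reinterpreted as local wall-clock time.
wrong = naive_utc.replace(tzinfo=mytimezone)

# Correct: the aware value is converted and still denotes the same instant.
right = aware_utc.astimezone(mytimezone)

print(wrong.isoformat())  # 2011-09-23T08:34:12+02:00
print(right.isoformat())  # 2011-09-23T10:34:12+02:00
```

Here `wrong` is two hours earlier than the real instant, while `right` compares equal to the original UTC value.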
Django should store datetimes in UTC in the database
....................................................
This horse has been beaten to death on this mailing-list so many times that I'll keep the argumentation short. If Django handles everything as UTC internally, it isn't useful to convert to anything else for storage, and re-convert to UTC at retrieval.
In order to make the database portable and interoperable:
- in databases that support timezones (at least PostgreSQL), the timezone should be set to UTC, so that the data is unambiguous;
- in databases that don't (at least SQLite), storing data in UTC is the most reasonable choice: if there's a "default timezone", that's UTC.
I don't intend to change the storage format of datetimes. It has been proposed on this mailing-list to store datetimes with original timezone information. However, I suspect that in many cases, datetimes don't have a significant "original timezone" by themselves. Furthermore, there are many different ways to implement this outside of Django's core. One is to store a local date + a local time + a place or timezone + is_dst flag and skip datetime entirely. Another is to store a UTC datetime + a place or timezone. In the end, since there's no obvious and consensual way to implement this idea, I've chosen to exclude it from my proposal. See the "Timezone-aware storage of DateTime" thread on this mailing list for a long and non-conclusive discussion of this idea.
I'm expecting to take some flak because of this choice :) Indeed, if you're writing a multi-timezone calendaring application, my work isn't going to resolve all your problems — but it won't hurt either. It may even provide a saner foundation to build upon. Once again, there's more than one way to solve this problem, and I'm afraid that choosing one would offend some people sufficiently to get the entire proposal rejected.
Django should convert between UTC and local time in the templates and forms
...........................................................................
I regard the problem of local time (in which time zone is my user?) as very similar to internationalization (which language does my user read?), and even more to localization (in which country does my user live?), because localization happens both on output and on input.
I want controllable conversion to local time when rendering a datetime in a template. I will introduce:
- a template tag, {% localtime on|off %}, that works exactly like {% localize on|off %}; it will be available with {% load tz %};
- two template filters, {{ datetime|localtime }} and {{ datetime|utctime }}, that work exactly like {{ value|localize }} and {{ value|unlocalize }}.
I will convert datetimes to local time when rendering a DateTimeInput widget, and also handle SplitDateTimeWidget and SplitHiddenDateTimeWidget which are more complicated.
Finally, I will convert datetimes entered by end-users in forms to UTC. I can't think of cases where you'd want an interface in local time but user input in UTC. As a consequence, I don't plan to introduce the equivalent of the `localize` keyword argument in form fields, unless someone brings up a sufficiently general use case.
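A rough sketch of the two borders, outside of any Django machinery. The function names and the fixed UTC-5 offset are hypothetical, chosen only for illustration; with a pytz timezone, the input side would use `localize()` rather than `replace(tzinfo=...)`.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical timezone of the current user (fixed UTC-5 for illustration).
user_tz = timezone(timedelta(hours=-5), "EST")

def to_local_for_display(value, tz):
    """Output border: convert an aware datetime to the user's timezone."""
    return value.astimezone(tz)

def to_utc_from_input(text, tz, fmt="%Y-%m-%d %H:%M:%S"):
    """Input border: parse wall-clock text typed by the user, return aware UTC."""
    naive = datetime.strptime(text, fmt)
    # Safe for a fixed offset; a pytz timezone would need tz.localize(naive).
    return naive.replace(tzinfo=tz).astimezone(timezone.utc)

stored = datetime(2011, 9, 23, 8, 34, 12, tzinfo=timezone.utc)
shown = to_local_for_display(stored, user_tz)
print(shown.strftime("%Y-%m-%d %H:%M:%S"))  # 2011-09-23 03:34:12

# Submitting the displayed value unchanged gives back the stored instant.
round_trip = to_utc_from_input("2011-09-23 03:34:12", user_tz)
```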
How to set each user's timezone
...............................
Internationalization and localization are based on the LANGUAGES setting. There's a widely accepted standard to automatically select the proper language and country: the Accept-Language header.
Unfortunately, some countries like the USA have more than one timezone, so country information isn't enough to select a timezone. To the best of my knowledge, there isn't a widely accepted way to determine the timezones of the end users on the web.
I intend to use the TIME_ZONE setting by default and to provide an equivalent of `translation.activate()` for setting the timezone. With this feature, developers can implement their own middleware to set the timezone for each user; for instance, they may want to use <http://pytz.sourceforge.net/#country-information>.
This means I'll have to introduce another thread local. I know this is frowned upon. I'd be very interested if someone has a better idea.
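For discussion, here is a minimal sketch of what such a thread-local API could look like. The names `activate`, `deactivate` and `get_current_timezone` are hypothetical, modelled on `translation.activate()`, and `DEFAULT_TIME_ZONE` stands in for the TIME_ZONE setting.

```python
import threading
from datetime import timedelta, timezone

DEFAULT_TIME_ZONE = timezone.utc  # stand-in for the TIME_ZONE setting
_active = threading.local()       # one slot per thread, i.e. per request

def activate(tz):
    """Set the timezone for the current thread only."""
    _active.value = tz

def deactivate():
    """Fall back to the default timezone in the current thread."""
    if hasattr(_active, "value"):
        del _active.value

def get_current_timezone():
    """Return the active timezone, or the default if none was activated."""
    return getattr(_active, "value", DEFAULT_TIME_ZONE)

# A middleware would call activate() with the user's timezone.
user_tz = timezone(timedelta(hours=9), "JST")
activate(user_tz)

# Another thread is unaffected and still sees the default.
seen_in_other_thread = []
worker = threading.Thread(
    target=lambda: seen_in_other_thread.append(get_current_timezone())
)
worker.start()
worker.join()
```

A middleware using this would also need to call `deactivate()` at the end of the request, so that a reused thread doesn't leak one user's timezone to the next.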
It might no longer be necessary to set os.environ['TZ'] and run time.tzset() at all. That would avoid a number of problems and make Windows as well supported as Unix-based OSes; there's a bunch of tickets in Trac about this.
I'm less familiar with this part of the project and I'm interested in advice about how to implement it properly.
Backwards compatibility
.......................
Most previous attempts to resolve this issue have stumbled upon this problem.
I propose to introduce a USE_TZ setting (yes, I know, yet another setting) that works exactly like USE_L10N. If set to False, the default, you will get the legacy (current) behavior. Thus, existing websites won't be affected. If set to True, you will get the new behavior described above.
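As an illustration of how the flag would switch behavior, consider a helper that returns the current time. The helper name `now` and passing the flag as a parameter are assumptions made for this sketch; real code would read the setting.

```python
from datetime import datetime, timezone

def now(use_tz):
    """Return the current time according to the (hypothetical) USE_TZ flag."""
    if use_tz:
        # New behavior: timezone-aware datetime in UTC.
        return datetime.now(timezone.utc)
    # Legacy behavior: naive datetime in the process's local time.
    return datetime.now()

legacy = now(False)
aware = now(True)
print(legacy.tzinfo)  # None
print(aware.tzinfo)   # UTC
```

Note that Python refuses to compare or subtract a naive and an aware datetime (it raises TypeError), so code that mixes the two modes fails loudly instead of silently producing wrong results.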
I will also explain in the release notes how to migrate a database — which means shifting all datetimes to UTC. I will attempt to develop a script to automate this task.
Dependency on pytz
..................
I plan to make pytz a mandatory dependency when USE_TZ is True. This would be similar to the dependency on gettext when USE_I18N is True.
pytz gets a new release every time the Olson database is updated. For this reason, it's better not to copy it in Django, unlike simplejson and unittest2.
It was split from Zope some time ago. It's a small amount of clean code and it could be maintained within Django if it was abandoned (however unlikely that sounds).
Miscellaneous
.............
The following items have caused bugs in the past and should be checked carefully:
- caching: add timezone to cache key? See #5691.
- functions that use LocalTimezone: naturaltime, timesince, timeuntil, dateformat.
- os.environ['TZ']. See #14264.
- time.tzset() isn't supported on Windows. See #7062.
Finally, my proposal shares some ideas with https://github.com/brosner/django-timezones; I didn't find any documentation, but I intend to review the code.
About me
--------
I've been working with Django since 2008. I'm doing a lot of triage in Trac, I've written some patches (notably r16349, r16539, r16548, also some documentation improvements and bug fixes), and I've helped to set up continuous integration (especially for Oracle). In my day job, I'm producing enterprise software based on Django with a team of ten developers.
Work plan
---------
Besides the research that's about 50% done, and discussion that's going to take place now, I expect the implementation and tests to take me around 80h. Given how much free time I can devote to Django, this means three to six months.
Here's an overview of my work plan:
- Implement the USE_TZ flag and database support — this requires checking the capabilities of each supported database in terms of datetime types and time zone support. Write tests, especially to ensure backwards compatibility. Write docs. (20h)
- Implement timezone localization in templates. Write tests. Write docs. (10h)
- Implement timezone localization in widgets and forms. Check the admin thoroughly. Write tests. Write docs. (15h)
- Implement the utilities to set the user's timezone. Write tests. Write docs. (15h)
- Reviews, etc. (20h)
What's next?
------------
Constructive criticism, obviously :) Remember that the main problems here are backwards-compatibility and keeping things simple.
Best regards,
--
Aymeric.
Annex: Research notes
---------------------
Wiki
....
[GSOC] https://code.djangoproject.com/wiki/SummerOfCode2011#Multipletimezonesupportfordatetimerepresentation
Relevant tickets
................
#2626: canonical ticket for this issue
#2447: dupe, an alternative solution
#8953: dupe, not much info
#10587: dupe, a fairly complete proposal, but doesn't address backwards compatibility for existing data
Relevant related tickets
........................
#14253: how should "now" behave in the admin when "client time" != "server time"?
Irrelevant related tickets
..........................
#11385: make it possible to enter data in a different timezone in DateTimeField
#12666: timezone in the 'Date:' headers of outgoing emails - independent resolution
Relevant threads
................
2011-05-31 Timezone-aware storage of DateTime
http://groups.google.com/group/django-developers/browse_thread/thread/76e2b486d561ab79
2010-08-16 Datetimes with timezones for mysql
https://groups.google.com/group/django-developers/browse_thread/thread/5e220687b7af26f5
2009-03-23 Django internal datetime handling
https://groups.google.com/group/django-developers/browse_thread/thread/ca023360ab457b91
2008-06-25 Proposal: PostgreSQL backends should *stop* using settings.TIME_ZONE
http://groups.google.com/group/django-developers/browse_thread/thread/b8c885389374c040
2007-12-02 Timezone aware datetimes and MySQL (ticket #5304)
https://groups.google.com/group/django-developers/browse_thread/thread/a9d765f83f552fa4
Relevant related threads
........................
2009-11-24 Why not datetime.utcnow() in auto_now/auto_now_add
http://groups.google.com/group/django-developers/browse_thread/thread/4ca560ef33c88bf3
Irrelevant related threads
..........................
2011-07-25 "c" date formating and Internet usage
https://groups.google.com/group/django-developers/browse_thread/thread/61296125a4774291
2011-02-10 GSoC 2011 student contribution
https://groups.google.com/group/django-developers/browse_thread/thread/0596b562cdaeac97/585ce1b04632198a?#585ce1b04632198a
2010-11-04 Changing settings per test
https://groups.google.com/group/django-developers/browse_thread/thread/65aabb45687e572e
2009-09-15 What is the status of auto_now and auto_now_add?
https://groups.google.com/group/django-developers/browse_thread/thread/cd1a76bca6055179
2009-03-09 TimeField broken in Oracle
https://groups.google.com/group/django-developers/browse_thread/thread/bba2f80a2ca9b068
2009-01-12 Rolling back tests -- status and open issues
https://groups.google.com/group/django-developers/browse_thread/thread/1e4f4c840b180895
2008-08-05 Transactional testsuite
https://groups.google.com/group/django-developers/browse_thread/thread/49aa551ad41fb919
--
You received this message because you are subscribed to the Google Groups "Django developers" group.
To post to this group, send email to django-developers-/JYPxA39Uh5TLH3MbocFF+G/***@public.gmane.org
To unsubscribe from this group, send email to django-developers+***@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/django-developers?hl=en.