11Python Twitter Search API
22=========================
33
4- This library serves as a Python interface to the various `Twitter
5- premium and enterprise search
6- APIs <https://developer.twitter.com/en/products/tweets/search> `__. It
7- provides a command-line utility and a library usable from within a
8- Python program. It comes with tools for assisting in dynamic generation
9- of search rules and for parsing tweets.
10-
11- Pretty docs can be seen
12- `here <https://twitterdev.github.io/search-tweets-python/ >`__.
4+ This project serves as a wrapper for the `Twitter premium and enterprise
5+ search
6+ APIs <https://developer.twitter.com/en/products/tweets/search>`__,
7+ providing a command-line utility and a Python library. Pretty docs can
8+ be seen `here <https://twitterdev.github.io/search-tweets-python/>`__.
139
1410Features
1511========
1612
1713- Supports 30-day Search and Full Archive Search (not the standard
18- Search API).
14+ Search API at this time).
1915- Command-line utility is pipeable to other tools (e.g., ``jq``).
20- - Automatically handles pagination of results with specifiable limits
16+ - Automatically handles pagination of search results with specifiable
17+ limits
2118- Delivers a stream of data to the user for low in-memory requirements
22- - Handles Enterprise and Premium authentication methods
19+ - Handles enterprise and premium authentication methods
2320- Flexible usage within a Python program
24- - Compatible with our group's Tweet Parser for rapid extraction of
25- relevant data fields from each tweet payload
21+ - Compatible with our group's `Tweet
22+ Parser <https://github.com/twitterdev/tweet_parser>`__ for rapid
23+ extraction of relevant data fields from each tweet payload
2624- Supports the Search Counts endpoint, which can reduce API call usage
27- and provide rapid insights if you only need volumes and not tweet
28- payloads
25+ and provide rapid insights if you only need Tweet volumes and not
26+ Tweet payloads
2927
3028Installation
3129============
@@ -51,46 +49,88 @@ Credential Handling
5149
5250The premium and enterprise Search APIs use different authentication
5351methods and we attempt to provide a seamless way to handle
54- authentication for all customers. We support both YAML-file based
55- methods and environment variables for access.
52+ authentication for all customers.
53+
54+ Premium clients will require the ``bearer_token`` and ``endpoint``
55+ fields; Enterprise clients require ``username``, ``password``, and
56+ ``endpoint``. If you do not specify the ``account_type``, we attempt to
57+ discern the account type and issue a warning about this behavior.
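
As a rough illustration of the rule described above, the sketch below guesses the account type from which fields are present. The function name ``infer_account_type`` and its error handling are hypothetical and are not taken from the library; only the field names come from the text.

```python
import warnings


def infer_account_type(creds):
    """Guess the account type from the credential fields that are present.

    Hypothetical helper for illustration; not the library's implementation.
    """
    explicit = creds.get("account_type")
    if explicit:
        # An explicit account_type always wins and triggers no warning.
        return explicit
    if creds.get("bearer_token"):
        guess = "premium"
    elif creds.get("username") and creds.get("password"):
        guess = "enterprise"
    else:
        raise KeyError("need bearer_token, or username and password")
    # Mirror the documented behavior: warn whenever the type was inferred.
    warnings.warn("account_type not specified; inferred '%s'" % guess)
    return guess


premium = {"endpoint": "<FULL_URL_OF_ENDPOINT>", "bearer_token": "<TOKEN>"}
enterprise = {
    "endpoint": "<FULL_URL_OF_ENDPOINT>",
    "username": "<USERNAME>",
    "password": "<PW>",
}
print(infer_account_type(premium))
print(infer_account_type(enterprise))
```

Setting ``account_type`` explicitly avoids both the guess and the warning.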
58+
59+ For premium search products, we use app-only authentication. Bearer
60+ tokens are not delivered with an expiration time, but they can be
61+ invalidated. Please see
62+ `here <https://developer.twitter.com/en/docs/basics/authentication/overview/application-only>`__
63+ for an overview of the premium authentication method.
5664
57- A YAML credential file should look like this:
65+ We support both YAML-file based methods and environment variables for
66+ storing credentials, and provide flexible handling with sensible
67+ defaults.
5868
59- .. code :: .yaml
69+ YAML method
70+ -----------
6071
61- <key>:
62- account_type: <OPTIONAL PREMIUM_OR_ENTERPRISE>
72+ For premium customers, the simplest credential file should look like
73+ this:
74+
75+ .. code:: yaml
76+
77+   search_tweets_api:
78+     account_type: premium
6379     endpoint: <FULL_URL_OF_ENDPOINT>
64-     username: <USERNAME>
65-     password: <PW>
6680     bearer_token: <TOKEN>
6781
68- Premium clients will require the ``bearer_token `` and ``endpoint ``
69- fields; Enterprise clients require ``username ``, ``password ``, and
70- ``endpoint ``. If you do not specify the ``account_type ``, we attempt to
71- discern the account type and declare a warning about this behavior. The
72- ``load_credentials `` function also allows ``account_type `` to be set.
82+ For enterprise customers, the simplest credential file should look like
83+ this:
84+
85+ .. code:: yaml
86+
87+   search_tweets_api:
88+     account_type: enterprise
89+     endpoint: <FULL_URL_OF_ENDPOINT>
90+     username: <USERNAME>
91+     password: <PW>
7392
74- Our credential reader will look for this file at
93+ By default, this library expects this file at
7594``"~/.twitter_keys.yaml"``, but you can pass the relevant location as
76- needed. You can also specify a different key in the yaml file, which can
77- be useful if you have different endpoints, e.g., ``dev ``, ``test ``,
78- ``prod ``, etc. The file might look like this:
95+ needed, either with the ``--credential-file`` flag for the command-line
96+ app or as demonstrated below in a Python program.
7997
80- .. code :: .yaml
98+ Both examples above require no special command-line arguments or
99+ in-program arguments. The credential parsing methods, unless otherwise
100+ specified, will look for a YAML key called ``search_tweets_api``.
101+
102+ For developers who have multiple endpoints and/or search products, you
103+ can keep all credentials in the same file and specify which key to
104+ use. ``--credential-file-key`` specifies this behavior in the
105+ command-line app. An example:
106+
107+ .. code:: yaml
81108
82-   search_tweets_dev:
109+   search_tweets_30_day_dev:
83110     account_type: premium
84111     endpoint: <FULL_URL_OF_ENDPOINT>
85112     bearer_token: <TOKEN>
86113
87-   search_tweets_prod:
114+   search_tweets_30_day_prod:
88115     account_type: premium
89116     endpoint: <FULL_URL_OF_ENDPOINT>
90117     bearer_token: <TOKEN>
91118
119+   search_tweets_fullarchive_dev:
120+     account_type: premium
121+     endpoint: <FULL_URL_OF_ENDPOINT>
122+     bearer_token: <TOKEN>
123+
124+   search_tweets_fullarchive_prod:
125+     account_type: premium
126+     endpoint: <FULL_URL_OF_ENDPOINT>
127+     bearer_token: <TOKEN>
128+
129+ Environment Variables
130+ ---------------------
131+
92132If you want or need to pass credentials via environment variables, you
93- can set the appropriate variables of the following:
133+ can set the appropriate variables for your product from the following:
94134
95135::
96136
@@ -101,13 +141,14 @@ can set the appropriate variables of the following:
101141 export SEARCHTWEETS_ACCOUNT_TYPE=
102142
103143The ``load_credentials`` function will attempt to find these variables
104- if it cannot load fields from the yaml file, and it will **overwrite any
105- found credentials from the YAML file ** if they have been parsed. This
106- behavior can be changed by setting the ``load_credentials `` parameter
107- ``env_overwrite `` to ``False ``.
144+ if it cannot load fields from the YAML file, and it will **overwrite any
145+ credentials from the YAML file that are present as environment
146+ variables** if they have been parsed. This behavior can be changed by
147+ setting the ``load_credentials`` parameter ``env_overwrite`` to
148+ ``False``.
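
The precedence rule above can be sketched with plain dictionaries. ``merge_credentials`` and the ``ENV_VARS`` mapping are hypothetical names used only for illustration, and the mapping lists only the variables that appear in this document; the library's own ``load_credentials`` may differ in detail.

```python
# Hypothetical mapping from credential field to environment variable name;
# only variables named in this document are listed.
ENV_VARS = {
    "endpoint": "SEARCHTWEETS_ENDPOINT",
    "username": "SEARCHTWEETS_USERNAME",
    "password": "SEARCHTWEETS_PASSWORD",
    "account_type": "SEARCHTWEETS_ACCOUNT_TYPE",
}


def merge_credentials(yaml_creds, environ, env_overwrite=True):
    """Combine YAML-file fields with environment variables.

    With env_overwrite=True, an environment variable replaces the YAML value.
    With env_overwrite=False, it only fills in fields the YAML file lacks.
    """
    merged = dict(yaml_creds)
    for field, var in ENV_VARS.items():
        value = environ.get(var)
        if value is None:
            continue
        if env_overwrite or field not in merged:
            merged[field] = value
    return merged


yaml_creds = {"endpoint": "<YAML_ENDPOINT>", "username": "<YAML_USERNAME>"}
environ = {
    "SEARCHTWEETS_USERNAME": "<ENV_USERNAME>",
    "SEARCHTWEETS_PASSWORD": "<ENV_PW>",
}
print(merge_credentials(yaml_creds, environ))
print(merge_credentials(yaml_creds, environ, env_overwrite=False))
```

A plain dictionary stands in for ``os.environ`` here so the example leaves the real environment untouched.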
108149
109- The following cells demonstrates credential handling, both in the
110- command line app and Python library.
150+ The following cells demonstrate credential handling in the Python
151+ library.
111152
112153.. code :: python
113154
@@ -145,22 +186,33 @@ regardless of a YAML file's validity or existence.
145186.. code :: python
146187
147188 import os
148- os.environ[" SEARCHTWEETS_USERNAME" ] = " ENV_USERNAME"
149- os.environ[" SEARCHTWEETS_PASSWORD" ] = " ENV_PW"
150- os.environ[" SEARCHTWEETS_ENDPOINT" ] = " https://endpoint"
189+ os.environ["SEARCHTWEETS_USERNAME"] = "<ENV_USERNAME>"
190+ os.environ["SEARCHTWEETS_PASSWORD"] = "<ENV_PW>"
191+ os.environ["SEARCHTWEETS_ENDPOINT"] = "<https://endpoint>"
151192
152- load_credentials(filename = " nothing " , yaml_key = " no_key_here" )
193+ load_credentials(filename="nothing_here.yaml", yaml_key="no_key_here")
153194
154195 ::
155196
156- cannot read file nothing
197+ cannot read file nothing_here.yaml
157198 Error parsing YAML file; searching for valid environment variables
158199
159200::
160201
161- {'endpoint': 'https://endpoint',
162- 'password': 'ENV_PW',
163- 'username': 'ENV_USERNAME'}
202+ {'endpoint': '<https://endpoint>',
203+ 'password': '<ENV_PW>',
204+ 'username': '<ENV_USERNAME>'}
205+
206+ Command-line app
207+ ----------------
208+
209+ The flags:
210+
211+ - ``--credential-file <FILENAME>``
212+ - ``--credential-file-key <KEY>``
213+ - ``--env-overwrite``
214+
215+ are used to control credential behavior from the command-line app.
164216
165217--------------
166218
@@ -171,11 +223,11 @@ The library includes an application, ``search_tweets.py``, in the
171223``tools `` directory that provides rapid access to Tweets.
172224
173225Note that the ``--results-per-call`` flag specifies an argument to the
174- API call ( ``maxResults ``, results returned per CALL), not as a hard max
175- to number of results returned from this program. The argument
226+ API (``maxResults``, results returned per CALL), not a hard max on the
227+ number of results returned from this program. The argument
176228``--max-results`` defines the maximum number of results to return from a
177229given run of the program. All examples assume that your credentials are set up
178- correctly in a default location - ``.twitter_keys.yaml `` or in
230+ correctly in the default location, ``.twitter_keys.yaml``, or in
179231environment variables.
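
To make the difference between the two flags concrete, here is a minimal simulation. ``fake_api_call`` is a stand-in that returns integers instead of Tweets, and truncating the final page at exactly ``max_results`` is an assumption made for the sketch, not confirmed behavior of ``search_tweets.py``.

```python
def fake_api_call(offset, results_per_call, total_available=1234):
    """Stand-in for one API request: returns (page, next_offset or None)."""
    end = min(offset + results_per_call, total_available)
    page = list(range(offset, end))
    next_offset = end if end < total_available else None
    return page, next_offset


def collect(results_per_call=100, max_results=500):
    """Page through results until max_results is reached or data runs out."""
    offset, n_calls, results = 0, 0, []
    while offset is not None and len(results) < max_results:
        page, offset = fake_api_call(offset, results_per_call)
        n_calls += 1
        # Keep only as many items as still fit under the overall cap.
        results.extend(page[: max_results - len(results)])
    return results, n_calls


results, n_calls = collect(results_per_call=100, max_results=250)
print(len(results), n_calls)  # prints: 250 3
```

Here ``results_per_call`` sets the size of each request, while ``max_results`` decides when paging stops; three calls were needed to gather 250 results.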
180232
181233**Stream JSON results to stdout without saving**
@@ -210,8 +262,8 @@ environment variables.
210262 --filename-prefix beyonce_geo \
211263 --no-print-stream
212264
213- Options can be passed via a configuration file (either ini or YAML). An
214- example file can be found in the ``tools/api_config_example.config `` or
265+ Options can be passed via a configuration file (either ini or YAML).
266+ Example files can be found in the ``tools/api_config_example.config`` or
215267``./tools/api_yaml_example.yaml`` files, which might look like this:
216268
217269.. code :: bash
@@ -270,18 +322,18 @@ Full options are listed below:
270322
271323 $ search_tweets.py -h
272324 usage: search_tweets.py [-h] [--credential-file CREDENTIAL_FILE]
273- [--credential-file-key CREDENTIAL_YAML_KEY]
274- [--env-overwrite ENV_OVERWRITE]
275- [--config-file CONFIG_FILENAME]
276- [--account-type {premium,enterprise}]
277- [--count-bucket COUNT_BUCKET]
278- [--start-datetime FROM_DATE] [--end-datetime TO_DATE]
279- [--filter-rule PT_RULE]
280- [--results-per-call RESULTS_PER_CALL]
281- [--max-results MAX_RESULTS] [--max-pages MAX_PAGES]
282- [--results-per-file RESULTS_PER_FILE]
283- [--filename-prefix FILENAME_PREFIX]
284- [--no-print-stream] [--print-stream] [--debug]
325+ [--credential-file-key CREDENTIAL_YAML_KEY]
326+ [--env-overwrite ENV_OVERWRITE]
327+ [--config-file CONFIG_FILENAME]
328+ [--account-type {premium,enterprise}]
329+ [--count-bucket COUNT_BUCKET]
330+ [--start-datetime FROM_DATE] [--end-datetime TO_DATE]
331+ [--filter-rule PT_RULE]
332+ [--results-per-call RESULTS_PER_CALL]
333+ [--max-results MAX_RESULTS] [--max-pages MAX_PAGES]
334+ [--results-per-file RESULTS_PER_FILE]
335+ [--filename-prefix FILENAME_PREFIX]
336+ [--no-print-stream] [--print-stream] [--debug]
285337
286338 optional arguments:
287339 -h, --help show this help message and exit
@@ -319,10 +371,10 @@ Full options are listed below:
319371 Number of results to return per call (default 100; max
320372 500) - corresponds to 'maxResults' in the API
321373 --max-results MAX_RESULTS
322- Maximum results to return for this session (defaults
323- to 500; see -a option
374+ Maximum number of Tweets or Counts to return for this
375+ session (defaults to 500)
324376 --max-pages MAX_PAGES
325- Maximum number of pages/api calls to use for this
377+ Maximum number of pages/API calls to use for this
326378 session.
327379 --results-per-file RESULTS_PER_FILE
328380 Maximum tweets to save per file.
@@ -336,7 +388,7 @@ Full options are listed below:
336388--------------
337389
338390Using the Twitter Search APIs' Python Wrapper
339- ============================================
391+ =============================================
340392
341393Working with the API within a Python program is straightforward both for
342394Premium and Enterprise clients.
@@ -621,21 +673,27 @@ Our results are pretty straightforward and can be rapidly used.
621673Dated searches / Full Archive Search
622674------------------------------------
623675
624- Let's make a new rule and pass it dates this time.
625-
626- ``gen_rule_payload `` takes dates of the forms ``YYYY-mm-DD `` and
627- ``YYYYmmDD ``.
628-
629676**Note that this will only work with the full archive search option**,
630677which is available to my account only via the enterprise options. Full
631678archive search will likely require a different endpoint or access
632679method; please see your developer console for details.
633680
681+ Let's make a new rule and pass it dates this time.
682+
683+ ``gen_rule_payload`` takes timestamps of the following forms:
684+
685+ - ``YYYYmmDDHHMM``
686+ - ``YYYY-mm-DD`` (which will convert to midnight UTC, 00:00)
687+ - ``YYYY-mm-DD HH:MM``
688+ - ``YYYY-mm-DDTHH:MM``
689+
690+ Note: all Tweets are stored in UTC.
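
The accepted forms listed above can be normalized with the standard library alone. The helper below, ``to_api_timestamp``, is a hypothetical sketch that rewrites every form into the first one (``YYYYmmDDHHMM``); it is not how ``gen_rule_payload`` is implemented, and the input is assumed to already be in UTC.

```python
from datetime import datetime

# strptime patterns matching the four documented forms, in the same order.
FORMATS = ("%Y%m%d%H%M", "%Y-%m-%d", "%Y-%m-%d %H:%M", "%Y-%m-%dT%H:%M")


def to_api_timestamp(text):
    """Normalize a timestamp string to YYYYmmDDHHMM (input assumed UTC)."""
    for fmt in FORMATS:
        try:
            parsed = datetime.strptime(text, fmt)
        except ValueError:
            continue  # try the next documented form
        # A date without a time parses as midnight (00:00).
        return parsed.strftime("%Y%m%d%H%M")
    raise ValueError("unrecognized timestamp: %r" % text)


print(to_api_timestamp("2017-09-01"))
print(to_api_timestamp("2017-10-30 13:45"))
```

A date with no time component resolves to midnight, which matches the note on the ``YYYY-mm-DD`` form.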
691+
634692.. code:: python
635693
636694 rule = gen_rule_payload("from:jack",
637- from_date = " 2017-09-01" ,
638- to_date = " 2017-10-30" ,
695+ from_date="2017-09-01",  # UTC 2017-09-01 00:00
696+ to_date="2017-10-30",  # UTC 2017-10-30 00:00
639697 results_per_call=500)
640698 print(rule)
641699