11Python Twitter Search API
22=========================
33
4- This library serves as a Python interface to the various `Twitter
5- premium and enterprise search
6- APIs <https://developer.twitter.com/en/products/tweets/search>`__. It
7- provides a command-line utility and a library usable from within a
8- Python program. It comes with tools for assisting in dynamic generation
9- of search rules and for parsing tweets.
10-
11- Pretty docs can be seen
12- `here <https://twitterdev.github.io/search-tweets-python/>`__.
4+ This project serves as a wrapper for the `Twitter premium and enterprise
5+ search
6+ APIs <https://developer.twitter.com/en/products/tweets/search>`__,
7+ providing a command-line utility and a Python library. Pretty docs can
8+ be seen `here <https://twitterdev.github.io/search-tweets-python/>`__.
139
1410Features
1511========
1612
1713- Supports 30-day Search and Full Archive Search (not the standard
18- Search API).
14+ Search API at this time).
1915- Command-line utility is pipeable to other tools (e.g., ``jq``).
20- - Automatically handles pagination of results with specifiable limits
16+ - Automatically handles pagination of search results with specifiable
17+ limits
2118- Delivers a stream of data to the user for low in-memory requirements
22- - Handles Enterprise and Premium authentication methods
19+ - Handles enterprise and premium authentication methods
2320- Flexible usage within a python program
24- - Compatible with our group's Tweet Parser for rapid extraction of
25- relevant data fields from each tweet payload
21+ - Compatible with our group's `Tweet
22+ Parser <https://github.com/twitterdev/tweet_parser>`__ for rapid
23+ extraction of relevant data fields from each tweet payload
2624- Supports the Search Counts endpoint, which can reduce API call usage
27- and provide rapid insights if you only need volumes and not tweet
28- payloads
25+ and provide rapid insights if you only need Tweet volumes and not
26+ Tweet payloads
2927
3028Installation
3129============
@@ -51,46 +49,81 @@ Credential Handling
5149
5250The premium and enterprise Search APIs use different authentication
5351methods and we attempt to provide a seamless way to handle
54- authentication for all customers. We support both YAML-file based
55- methods and environment variables for access.
52+ authentication for all customers.
53+
54+ Premium clients will require the ``bearer_token`` and ``endpoint``
55+ fields; Enterprise clients require ``username``, ``password``, and
56+ ``endpoint``. If you do not specify the ``account_type``, we attempt to
57+ discern the account type and declare a warning about this behavior.
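
The field requirements above suggest one way the account type could be inferred when ``account_type`` is missing. The following is a minimal sketch under that assumption, not the library's actual logic; ``infer_account_type`` is a hypothetical helper name introduced only for illustration.

```python
import warnings


def infer_account_type(creds):
    """Guess the account type from which credential fields are present.

    Hypothetical helper: assumes premium credentials carry a bearer_token
    and enterprise credentials carry a username and password.
    """
    # An explicit account_type always wins; no guessing is needed.
    if creds.get("account_type"):
        return creds["account_type"]

    if creds.get("bearer_token"):
        guess = "premium"
    elif creds.get("username") and creds.get("password"):
        guess = "enterprise"
    else:
        raise KeyError("no bearer_token or username/password found")

    # Mirror the documented behavior of warning when the type is inferred.
    warnings.warn("account_type not specified; assuming " + guess)
    return guess


print(infer_account_type({"endpoint": "<URL>", "bearer_token": "<TOKEN>"}))
```

Setting ``account_type`` explicitly in the credential file avoids both the guess and the warning.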
58+
59+ We support both YAML-file based methods and environment variables for
60+ access, and provide flexible handling with sensible defaults.
5661
57- A YAML credential file should look like this:
62+ YAML method
63+ -----------
5864
59- .. code:: .yaml
65+ For premium customers, the simplest credential file should look like
66+ this:
6067
61- <key>:
62- account_type: <OPTIONAL PREMIUM_OR_ENTERPRISE>
68+ .. code:: yaml
69+
70+ search_tweets_api:
71+ account_type: premium
6372 endpoint: <FULL_URL_OF_ENDPOINT>
64- username: <USERNAME>
65- password: <PW>
6673 bearer_token: <TOKEN>
6774
68- Premium clients will require the ``bearer_token`` and ``endpoint``
69- fields; Enterprise clients require ``username``, ``password``, and
70- ``endpoint``. If you do not specify the ``account_type``, we attempt to
71- discern the account type and declare a warning about this behavior. The
72- ``load_credentials`` function also allows ``account_type`` to be set.
75+ For enterprise customers, the simplest credential file should look like
76+ this:
77+
78+ .. code:: yaml
7379
74- Our credential reader will look for this file at
80+ search_tweets_api:
81+ account_type: enterprise
82+ endpoint: <FULL_URL_OF_ENDPOINT>
83+ username: <USERNAME>
84+ password: <PW>
85+
86+ By default, this library expects this file at
7587``"~/.twitter_keys.yaml"``, but you can pass the relevant location as
76- needed. You can also specify a different key in the yaml file, which can
77- be useful if you have different endpoints, e.g., ``dev``, ``test``,
78- ``prod``, etc. The file might look like this:
88+ needed, either with the ``--credential-file`` flag for the command-line
89+ app or as demonstrated below in a Python program.
90+
91+ Both above examples require no special command-line arguments or
92+ in-program arguments. The credential parsing methods, unless otherwise
93+ specified, will look for a YAML key called ``search_tweets_api``.
94+
95+ For developers who have multiple endpoints and/or search products, you
96+ can keep all credentials in the same file and specify specific keys to
97+ use. ``--credential-file-key`` specifies this behavior in the command
98+ line app. An example:
99+
100+ .. code:: yaml
101+
102+ search_tweets_30_day_dev:
103+ account_type: premium
104+ endpoint: <FULL_URL_OF_ENDPOINT>
105+ bearer_token: <TOKEN>
79106
80- .. code:: .yaml
107+ search_tweets_30_day_prod:
108+ account_type: premium
109+ endpoint: <FULL_URL_OF_ENDPOINT>
110+ bearer_token: <TOKEN>
81111
82- search_tweets_dev:
112+ search_tweets_fullarchive_dev:
83113 account_type: premium
84114 endpoint: <FULL_URL_OF_ENDPOINT>
85115 bearer_token: <TOKEN>
86116
87- search_tweets_prod:
117+ search_tweets_fullarchive_prod:
88118 account_type: premium
89119 endpoint: <FULL_URL_OF_ENDPOINT>
90120 bearer_token: <TOKEN>
91121
122+ Environment Variables
123+ ---------------------
124+
92125If you want or need to pass credentials via environment variables, you
93- can set the appropriate variables of the following:
126+ can set the appropriate variables for your product of the following:
94127
95128::
96129
@@ -101,13 +134,14 @@ can set the appropriate variables of the following:
101134 export SEARCHTWEETS_ACCOUNT_TYPE=
102135
103136The ``load_credentials`` function will attempt to find these variables
104- if it cannot load fields from the yaml file, and it will **overwrite any
105- found credentials from the YAML file** if they have been parsed. This
106- behavior can be changed by setting the ``load_credentials`` parameter
107- ``env_overwrite`` to ``False``.
137+ if it cannot load fields from the YAML file, and it will **overwrite any
138+ credentials from the YAML file that are present as environment
139+ variables** if they have been parsed. This behavior can be changed by
140+ setting the ``load_credentials`` parameter ``env_overwrite`` to
141+ ``False``.
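
The precedence rule described above can be sketched as a plain dictionary merge. This is an illustration of the documented behavior, not the library's implementation; ``merge_credentials`` and ``ENV_FIELDS`` are hypothetical names, and only the environment variable names that appear in this document are used.

```python
import os

# Mapping from credential field to environment variable name.
# Only variables named in this document are listed here.
ENV_FIELDS = {
    "account_type": "SEARCHTWEETS_ACCOUNT_TYPE",
    "endpoint": "SEARCHTWEETS_ENDPOINT",
    "username": "SEARCHTWEETS_USERNAME",
    "password": "SEARCHTWEETS_PASSWORD",
}


def merge_credentials(yaml_creds, environ=None, env_overwrite=True):
    """Combine YAML credentials with any matching environment variables."""
    environ = os.environ if environ is None else environ
    env_creds = {
        field: environ[var]
        for field, var in ENV_FIELDS.items()
        if var in environ
    }
    if env_overwrite:
        # Environment values replace YAML values for the same field.
        return {**yaml_creds, **env_creds}
    # YAML values win; the environment only fills in missing fields.
    return {**env_creds, **yaml_creds}


yaml_creds = {"endpoint": "https://yaml", "username": "yaml_user"}
env = {"SEARCHTWEETS_USERNAME": "env_user", "SEARCHTWEETS_PASSWORD": "env_pw"}
print(merge_credentials(yaml_creds, env))
print(merge_credentials(yaml_creds, env, env_overwrite=False))
```

In both modes a field missing from the YAML file is filled from the environment; the flag only decides which source wins when both define the same field.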
108142
109- The following cells demonstrates credential handling, both in the
110- command line app and Python library.
143+ The following cells demonstrate credential handling in the Python
144+ library.
111145
112146.. code:: python
113147
@@ -145,22 +179,33 @@ regardless of a YAML file's validity or existence.
145179.. code:: python
146180
147181 import os
148- os.environ["SEARCHTWEETS_USERNAME"] = "ENV_USERNAME"
149- os.environ["SEARCHTWEETS_PASSWORD"] = "ENV_PW"
150- os.environ["SEARCHTWEETS_ENDPOINT"] = "https://endpoint"
182+ os.environ["SEARCHTWEETS_USERNAME"] = "<ENV_USERNAME>"
183+ os.environ["SEARCHTWEETS_PASSWORD"] = "<ENV_PW>"
184+ os.environ["SEARCHTWEETS_ENDPOINT"] = "<https://endpoint>"
151185
152- load_credentials(filename="nothing", yaml_key="no_key_here")
186+ load_credentials(filename="nothing_here.yaml", yaml_key="no_key_here")
153187
154188 ::
155189
156- cannot read file nothing
190+ cannot read file nothing_here.yaml
157191 Error parsing YAML file; searching for valid environment variables
158192
159193::
160194
161- {'endpoint': 'https://endpoint',
162- 'password': 'ENV_PW',
163- 'username': 'ENV_USERNAME'}
195+ {'endpoint': '<https://endpoint>',
196+ 'password': '<ENV_PW>',
197+ 'username': '<ENV_USERNAME>'}
198+
199+ Command-line app
200+ ----------------
201+
202+ The flags:
203+
204+ - ``--credential-file <FILENAME>``
205+ - ``--credential-file-key <KEY>``
206+ - ``--env-overwrite``
207+
208+ are used to control credential behavior from the command-line app.
164209
165210--------------
166211
@@ -171,11 +216,11 @@ The library includes an application, ``search_tweets.py``, in the
171216``tools`` directory that provides rapid access to Tweets.
172217
173218Note that the ``--results-per-call`` flag specifies an argument to the
174- API call (``maxResults``, results returned per CALL), not as a hard max
175- to number of results returned from this program. The argument
219+ API (``maxResults``, results returned per CALL), not as a hard max to
220+ number of results returned from this program. The argument
176221``--max-results`` defines the maximum number of results to return from a
177222given call. All examples assume that your credentials are set up
178- correctly in a default location - ``.twitter_keys.yaml`` or in
223+ correctly in the default location - ``.twitter_keys.yaml`` or in
179224environment variables.
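
To make the difference between the per-call size and the overall cap concrete, here is a small self-contained sketch. It does not call the Twitter API: ``fake_api_call`` is a hypothetical stand-in that serves pages from an imaginary pool of 1200 results, and ``collect`` only illustrates how ``--results-per-call``, ``--max-results``, and ``--max-pages`` could interact.

```python
def fake_api_call(start, results_per_call, total_available=1200):
    """Stand-in for one API request; returns one page and the next offset."""
    end = min(start + results_per_call, total_available)
    page = list(range(start, end))
    next_start = end if end < total_available else None
    return page, next_start


def collect(results_per_call=100, max_results=500, max_pages=None):
    """Page through results until a cap is reached or the data runs out."""
    results, pages, start = [], 0, 0
    while start is not None and len(results) < max_results:
        if max_pages is not None and pages >= max_pages:
            break
        page, start = fake_api_call(start, results_per_call)
        pages += 1
        results.extend(page)
    # The last page may overshoot the cap, so trim to max_results.
    return results[:max_results], pages


tweets, calls = collect(results_per_call=100, max_results=250)
print(len(tweets), calls)
```

With 100 results per call and a cap of 250, this sketch makes three calls and keeps 250 results, which shows why a larger per-call size tends to use fewer API calls for the same cap.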
180225
181226**Stream json results to stdout without saving**
@@ -210,8 +255,8 @@ environment variables.
210255 --filename-prefix beyonce_geo \
211256 --no-print-stream
212257
213- Options can be passed via a configuration file (either ini or YAML). An
214- example file can be found in the ``tools/api_config_example.config`` or
258+ Options can be passed via a configuration file (either ini or YAML).
259+ Example files can be found in the ``tools/api_config_example.config`` or
215260``./tools/api_yaml_example.yaml`` files, which might look like this:
216261
217262.. code:: bash
@@ -270,18 +315,18 @@ Full options are listed below:
270315
271316 $ search_tweets.py -h
272317 usage: search_tweets.py [-h] [--credential-file CREDENTIAL_FILE]
273- [--credential-file-key CREDENTIAL_YAML_KEY]
274- [--env-overwrite ENV_OVERWRITE]
275- [--config-file CONFIG_FILENAME]
276- [--account-type {premium,enterprise}]
277- [--count-bucket COUNT_BUCKET]
278- [--start-datetime FROM_DATE] [--end-datetime TO_DATE]
279- [--filter-rule PT_RULE]
280- [--results-per-call RESULTS_PER_CALL]
281- [--max-results MAX_RESULTS] [--max-pages MAX_PAGES]
282- [--results-per-file RESULTS_PER_FILE]
283- [--filename-prefix FILENAME_PREFIX]
284- [--no-print-stream] [--print-stream] [--debug]
318+ [--credential-file-key CREDENTIAL_YAML_KEY]
319+ [--env-overwrite ENV_OVERWRITE]
320+ [--config-file CONFIG_FILENAME]
321+ [--account-type {premium,enterprise}]
322+ [--count-bucket COUNT_BUCKET]
323+ [--start-datetime FROM_DATE] [--end-datetime TO_DATE]
324+ [--filter-rule PT_RULE]
325+ [--results-per-call RESULTS_PER_CALL]
326+ [--max-results MAX_RESULTS] [--max-pages MAX_PAGES]
327+ [--results-per-file RESULTS_PER_FILE]
328+ [--filename-prefix FILENAME_PREFIX]
329+ [--no-print-stream] [--print-stream] [--debug]
285330
286331 optional arguments:
287332 -h, --help show this help message and exit
@@ -319,10 +364,10 @@ Full options are listed below:
319364 Number of results to return per call (default 100; max
320365 500) - corresponds to 'maxResults' in the API
321366 --max-results MAX_RESULTS
322- Maximum results to return for this session (defaults
323- to 500; see -a option
367+ Maximum number of Tweets or Counts to return for this
368+ session (defaults to 500)
324369 --max-pages MAX_PAGES
325- Maximum number of pages/api calls to use for this
370+ Maximum number of pages/API calls to use for this
326371 session.
327372 --results-per-file RESULTS_PER_FILE
328373 Maximum tweets to save per file.
@@ -336,7 +381,7 @@ Full options are listed below:
336381--------------
337382
338383Using the Twitter Search APIs' Python Wrapper
339- ============================================
384+ =============================================
340385
341386Working with the API within a Python program is straightforward both for
342387Premium and Enterprise clients.
@@ -621,21 +666,27 @@ Our results are pretty straightforward and can be rapidly used.
621666Dated searches / Full Archive Search
622667------------------------------------
623668
624- Let's make a new rule and pass it dates this time.
625-
626- ``gen_rule_payload`` takes dates of the forms ``YYYY-mm-DD`` and
627- ``YYYYmmDD``.
628-
629669**Note that this will only work with the full archive search option**,
630670which is available to my account only via the enterprise options. Full
631671archive search will likely require a different endpoint or access
632672method; please see your developer console for details.
633673
674+ Let's make a new rule and pass it dates this time.
675+
676+ ``gen_rule_payload`` takes timestamps of the following forms:
677+
678+ - ``YYYYmmDDHHMM``
679+ - ``YYYY-mm-DD`` (which will convert to midnight UTC, 00:00)
680+ - ``YYYY-mm-DD HH:MM``
681+ - ``YYYY-mm-DDTHH:MM``
682+
683+ Note - all Tweets are stored in UTC time.
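
As an illustration of how the accepted forms listed above could be normalized, the sketch below converts each one to the compact ``YYYYmmDDHHMM`` form using only the standard library. It is an assumption-laden example rather than the code inside ``gen_rule_payload``; ``to_api_timestamp`` is a hypothetical helper, and treating ``YYYYmmDDHHMM`` as the target format is an assumption.

```python
from datetime import datetime

# strptime patterns for the four documented input forms.
FORMATS = [
    "%Y%m%d%H%M",       # YYYYmmDDHHMM
    "%Y-%m-%d",         # YYYY-mm-DD (time defaults to 00:00)
    "%Y-%m-%d %H:%M",   # YYYY-mm-DD HH:MM
    "%Y-%m-%dT%H:%M",   # YYYY-mm-DDTHH:MM
]


def to_api_timestamp(value):
    """Return value as YYYYmmDDHHMM, trying each documented form in turn."""
    for fmt in FORMATS:
        try:
            parsed = datetime.strptime(value, fmt)
        except ValueError:
            continue  # not this form; try the next one
        return parsed.strftime("%Y%m%d%H%M")
    raise ValueError("unrecognized timestamp form: " + repr(value))


for example in ["201709011230", "2017-09-01", "2017-09-01 12:30", "2017-09-01T12:30"]:
    print(example, "->", to_api_timestamp(example))
```

No timezone conversion is attempted here; the inputs are simply taken to be UTC already, in line with the note above.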
684+
634685.. code:: python
635686
636687 rule = gen_rule_payload("from:jack",
637- from_date="2017-09-01",
638- to_date="2017-10-30",
688+ from_date="2017-09-01",  # UTC 2017-09-01 00:00
689+ to_date="2017-10-30",  # UTC 2017-10-30 00:00
639690 results_per_call=500)
640691 print(rule)
641692