How do I configure the NLTK data directory from code?

Just change the items of nltk.data.path; it is a plain Python list.

From the source, http://www.nltk.org/_modules/nltk/data.html:

 ``nltk:path``: Specifies the file stored in the NLTK data package at *path*. NLTK will search for these files in the directories specified by ``nltk.data.path``. 

Then, further down in the same module:

    ######################################################################
    # Search Path
    ######################################################################

    path = []
    """A list of directories where the NLTK data package might reside.
       These directories will be checked in order when looking for a
       resource in the data package.  Note that this allows users to
       substitute in their own versions of resources, if they have them
       (eg, in their home directory under ~/nltk_data)."""

    # User-specified locations:
    path += [d for d in os.environ.get('NLTK_DATA', str('')).split(os.pathsep) if d]
    if os.path.expanduser('~/') != '~/':
        path.append(os.path.expanduser(str('~/nltk_data')))

    if sys.platform.startswith('win'):
        # Common locations on Windows:
        path += [
            str(r'C:\nltk_data'), str(r'D:\nltk_data'), str(r'E:\nltk_data'),
            os.path.join(sys.prefix, str('nltk_data')),
            os.path.join(sys.prefix, str('lib'), str('nltk_data')),
            os.path.join(os.environ.get(str('APPDATA'), str('C:\\')), str('nltk_data'))
        ]
    else:
        # Common locations on UNIX & OS X:
        path += [
            str('/usr/share/nltk_data'),
            str('/usr/local/share/nltk_data'),
            str('/usr/lib/nltk_data'),
            str('/usr/local/lib/nltk_data')
        ]
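To make the ordering in that excerpt easier to see, here is a minimal standard-library sketch that mimics it. `build_search_path` and the example directories are hypothetical names used only for illustration; this is not NLTK's own API, and the system-wide locations are abbreviated to one entry per platform.

```python
import os


def build_search_path(environ, platform, home):
    """Hypothetical helper that mirrors the order used in the excerpt."""
    dirs = []
    # 1. Entries from NLTK_DATA, split on the platform's path separator.
    dirs += [d for d in environ.get("NLTK_DATA", "").split(os.pathsep) if d]
    # 2. The per-user directory, if a home directory is known.
    if home:
        dirs.append(os.path.join(home, "nltk_data"))
    # 3. System-wide locations (abbreviated to one entry here).
    if platform.startswith("win"):
        dirs.append(r"C:\nltk_data")
    else:
        dirs.append("/usr/share/nltk_data")
    return dirs


# Example values are assumptions, not real installation paths.
example_env = {"NLTK_DATA": os.pathsep.join(["/opt/corpora", "/srv/nltk_data"])}
search_path = build_search_path(example_env, "linux", "/home/alice")
print(search_path)
```

The point is that directories named in NLTK_DATA come first, so they take precedence over the home directory and the system-wide defaults.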

To modify the search path, simply append to the list of candidate directories:

    import nltk
    nltk.data.path.append("/home/yourusername/whateverpath/")

Or on Windows:

    import nltk
    # Raw string, so that \f and \a are not interpreted as escape sequences
    nltk.data.path.append(r"C:\somewhere\farfar\away\path")
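Backslashes in Windows paths need care in Python string literals, because sequences such as `\f` and `\a` are escape codes. The following sketch uses only the standard library and shows three spellings that produce the same path; the path itself is just the placeholder from the example.

```python
from pathlib import PureWindowsPath

# In a normal literal, "\f" is a form feed and "\a" is a bell character,
# so a path containing \farfar or \away would be silently altered.
assert "\f" == "\x0c" and "\a" == "\x07"

raw = r"C:\somewhere\farfar\away\path"          # raw string keeps backslashes
escaped = "C:\\somewhere\\farfar\\away\\path"   # doubled backslashes
forward = str(PureWindowsPath("C:/somewhere/farfar/away/path"))  # forward slashes

print(raw == escaped == forward)  # prints True
```

Any of the three strings can be passed to `nltk.data.path.append`.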

I use append, for example:

 nltk.data.path.append('/libs/nltk_data/') 
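Since the directories are checked in order, an appended directory is searched last. If your copy of a resource should win over the defaults, inserting at the front is one option. Here is a small sketch; `add_data_dir` is a hypothetical helper, and `demo_path` stands in for `nltk.data.path` so the example runs without NLTK installed.

```python
def add_data_dir(search_path, directory, prepend=False):
    """Add directory to search_path unless it is already listed."""
    if directory in search_path:
        return search_path
    if prepend:
        search_path.insert(0, directory)   # searched first
    else:
        search_path.append(directory)      # searched last
    return search_path


# Stand-in for nltk.data.path; the directories are made-up examples.
demo_path = ["/home/alice/nltk_data", "/usr/share/nltk_data"]
add_data_dir(demo_path, "/libs/nltk_data/")
add_data_dir(demo_path, "/opt/project/nltk_data", prepend=True)
print(demo_path)
```

With NLTK installed, the same call would presumably look like `add_data_dir(nltk.data.path, '/libs/nltk_data/')`.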

For those using uwsgi:

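I ran into trouble because I wanted a uwsgi app (running as a different user than myself) to have access to NLTK data that I had downloaded earlier. What worked for me was adding this line to myapp_uwsgi.ini: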

 env = NLTK_DATA=/home/myuser/nltk_data/ 

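This sets the NLTK_DATA environment variable as suggested, so the directory ends up on NLTK's search path.
After making this change, you may need to restart your uwsgi process.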
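If the app still cannot find the data, it may help to check from inside the process whether the variable arrived and whether the user running the app can actually read the directory. Here is a rough sketch using only the standard library; `report_data_dirs` is a hypothetical helper, not part of NLTK or uwsgi.

```python
import os


def report_data_dirs(environ=os.environ):
    """Return (directory, usable) pairs for each entry in NLTK_DATA."""
    entries = [d for d in environ.get("NLTK_DATA", "").split(os.pathsep) if d]
    report = []
    for d in entries:
        # R_OK to read files, X_OK to traverse the directory.
        usable = os.path.isdir(d) and os.access(d, os.R_OK | os.X_OK)
        report.append((d, usable))
    return report


for directory, usable in report_data_dirs():
    print(directory, "ok" if usable else "missing or not accessible")
```

An empty report means NLTK_DATA is not set in that process; an entry flagged as not accessible usually points to a permissions problem for the user the app runs as.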