Building a Word Segmentation Service with web.py and jieba

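Preface

This post sets up a word segmentation service behind a web server. The web server is built with the open-source web.py framework, and the segmentation is done by jieba, a Chinese word segmentation library.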

Preparing the installation files

File                                        Download URL
Web server (web.py, open-source framework)  http://webpy.org/static/web.py-0.38.tar.gz
jieba segmentation library (open source)    https://github.com/fxsjy/jieba

Installing web.py on Linux

[root@master201 Soft]# wget http://webpy.org/static/web.py-0.38.tar.gz
--2018-11-03 22:15:33-- http://webpy.org/static/web.py-0.38.tar.gz
Resolving webpy.org (webpy.org)... 192.30.252.153
Connecting to webpy.org (webpy.org)|192.30.252.153|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 91877 (90K) [application/gzip]
Saving to: ‘web.py-0.38.tar.gz’
100%[=========================================================================================================================>] 91,877 36.1KB/s in 2.5s
2018-11-03 22:15:38 (36.1 KB/s) - ‘web.py-0.38.tar.gz’ saved [91877/91877]
[root@master201 Soft]# tar -xvf web.py-0.38.tar.gz
[root@master201 Soft]# cd web.py-0.38/
[root@master201 web.py-0.38]# python setup.py install
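
One way to confirm the installation is to check that the web module can be imported by the same interpreter that ran setup.py. The sketch below is only an illustration; is_importable is a hypothetical helper name and not part of web.py.

```python
import importlib


def is_importable(name):
    """Return True if the named module can be imported by this interpreter."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


# Example: is_importable("web") should be True after a successful install.
```

The same check can be run for jieba once its package directory has been put in place.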

web.py demo and startup

Demo program main.py

import web

# URL routing: path pattern -> name of the handler class
urls = (
    '/', 'index',
    '/test', 'test',
)
app = web.application(urls, globals())

class index:
    def GET(self):
        # Echo back the "context" query parameter (empty string if absent)
        params = web.input()
        context = params.get('context', '')
        return context

class test:
    def GET(self):
        # Print the request parameters to the console
        print(web.input())
        return 'the first webpy application from test'

if __name__ == '__main__':
    app.run()

Starting the web.py application

[root@master201 web.py-0.38]# python main.py 8080
http://0.0.0.0:8080/

Access log (the <Storage {}> line is printed by the test handler)
<Storage {}>
192.168.152.1:64945 - - [04/Nov/2018 11:27:05] "HTTP/1.1 GET /test" - 200 OK
192.168.152.1:64945 - - [04/Nov/2018 11:27:05] "HTTP/1.1 GET /favicon.ico" - 404 Not Found
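
With the demo server running, the index handler can be exercised by sending a context query parameter. The following is a minimal client sketch, assuming Python 3 on the client side; build_url and fetch are hypothetical helper names, and the host and port depend on where main.py was started.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_url(base, context):
    """Return base with the context query parameter percent-encoded (UTF-8)."""
    return base + "?" + urlencode({"context": context})


def fetch(base, context, timeout=5):
    """Send the GET request and decode the response body as UTF-8."""
    with urlopen(build_url(base, context), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Example (needs the demo server to be up; the address is an assumption):
# print(fetch("http://127.0.0.1:8080/", "hello webpy"))
```

Encoding the parameter with urlencode matters once the input contains Chinese text, because non-ASCII characters must be percent-encoded in a URL.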

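Basic use of jieba with web.py

  • Upload the downloaded jieba package to the server.

  • The jieba package imported by the code below must be copied into the web.py directory, next to main.py, so that import jieba can find it:

    [root@master201 Soft]# cp -rf /home/lishijia/Soft/jieba/jieba/ /home/lishijia/Soft/web.py-0.38
  • Segmentation handler code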

import web

# Import the jieba packages
import jieba
import jieba.posseg
import jieba.analyse

# URL routing (path -> handler class)
urls = (
    '/index', 'index',
    '/test', 'test',
)

# Initialize the web app
app = web.application(urls, globals())

# Handle requests to /index
class index:
    def GET(self):
        # Declare UTF-8 so the response is not garbled. web.header must be
        # called while a request is being handled, so it belongs inside GET.
        # The body is a comma-separated string, hence text/plain.
        web.header('Content-Type', 'text/plain; charset=utf-8', unique=True)
        params = web.input()
        # Read the request parameter "context"
        input_context = params.get('context', '')
        # Segment the input with jieba
        seg_context = jieba.cut(input_context)
        # jieba.cut returns a generator of tokens; join them into one string
        return_context = ",".join(seg_context)
        return return_context

# Handle requests to /test
class test:
    def GET(self):
        print(web.input())
        return 'the first webpy application from test'

if __name__ == '__main__':
    app.run()
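
The /index handler returns the tokens as one comma-joined string, which becomes ambiguous when a token is itself a comma. One possible alternative is to return JSON. The sketch below is not the author's method; tokens_to_json is a hypothetical helper, and fake_cut is only a stand-in so the example runs without jieba installed.

```python
import json


def tokens_to_json(tokens):
    """Serialize an iterable of tokens as a JSON object.

    list() consumes the iterable, so a lazy generator such as the one
    returned by jieba.cut is fully evaluated here.
    """
    return json.dumps({"tokens": list(tokens)})


def fake_cut(text):
    """Stand-in for jieba.cut in this sketch: lazily yields space-separated tokens."""
    for token in text.split():
        yield token


# In the service this would be called as tokens_to_json(jieba.cut(input_context)).
```

By default json.dumps escapes non-ASCII characters as \uXXXX sequences, so the response body is plain ASCII and decodes correctly regardless of the charset the client assumes.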
