北京社交网站建设,视频网站后台设计,坦洲网站建设公司,黄江镇做网站一、创建虚拟环境
好习惯#xff0c;首先创建单独的运行环境
conda create -n uie python3.10.9
conda activate uie
二、安装paddle框架及paddlenlp
2.1 参考官方文档安装paddle
开始使用_飞桨-源于产业实践的开源深度学习平台
首先查看自己服务器cuda版本#xff0c;…一、创建虚拟环境
好习惯首先创建单独的运行环境
conda create -n uie python3.10.9
conda activate uie
二、安装paddle框架及paddlenlp
2.1 参考官方文档安装paddle
开始使用_飞桨-源于产业实践的开源深度学习平台
首先查看自己服务器cuda版本如下我的版本时10.2
(PyTorch-1.8) [ma-user work]$nvidia-smi
Wed Apr 19 23:35:11 2023
-----------------------------------------------------------------------------
| NVIDIA-SMI 440.33.01 Driver Version: 440.33.01 CUDA Version: 10.2 |
|---------------------------------------------------------------------------
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
||
| 0 Tesla P100-PCIE... Off | 00000000:00:0E.0 Off | 0 |
| N/A 39C P0 28W / 250W | 0MiB / 16280MiB | 0% Default |
--------------------------------------------------------------------------------------------------------------------------------------------------------
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
||
| No running processes found |
-----------------------------------------------------------------------------
在Paddle官网直接复制命令即可。 2.2 安装paddlenlp
pip install --upgrade paddlenlp
2.2.1 问题一 ERROR: Failed building wheel for numpy Failed to build numpy
-x86_64-3.10/numpy/core/src/multiarray/scalartypes.o -MMD -MF build/temp.linux-x86_64-3.10/build/src.linux-x86_64-3.10/numpy/core/src/multiarray/scalartypes.o.d failed with exit status 1[end of output]note: This error originates from a subprocess, and is likely not a problem with pip.ERROR: Failed building wheel for numpyFailed to build numpyERROR: Could not build wheels for numpy, which is required to install pyproject.toml-based projects[end of output]note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error× pip subprocess to install backend dependencies did not run successfully.
│ exit code: 1
╰─ See above for output.note: This error originates from a subprocess, and is likely not a problem with pip
手工安装numpy包再次执行nlp包安装还是不行。
pip install numpy
换另外一种方式成功
python3 -m pip install --upgrade paddlenlp -i https://mirror.baidu.com/pypi/simple
三、下载PaddleNLP源码
$git clone https://github.com/PaddlePaddle/PaddleNLP.git
四、执行训练
4.1、对标注数据进行预处理
python ../PaddleNLP/model_zoo/uie/doccano.py --doccano_file ./data.json --task_type ext --save_dir ./ --splits 0.7 0.2 0.1 --schema_lang ch
4.2、模型精调
$python ../PaddleNLP/model_zoo/uie/finetune.py --device gpu --logging_steps 10 --save_steps 100 --eval_steps 100 --seed 42 --model_name_or_path uie-base --output_dir $finetuned_model --train_path ./train.txt --dev_path ./dev.txt --max_seq_length 512 --per_device_eval_batch_size 16 --per_device_train_batch_size 16 --num_train_epochs 20 --learning_rate 1e-5 --label_names start_positions end_positions --do_train --do_eval --do_export --export_model_dir $finetuned_model --overwrite_output_dir --disable_tqdm True --metric_for_best_model eval_f1 --load_best_model_at_end True --save_total_limit 1
出现下图及训练成功 五、模型应用
from pprint import pprint
from paddlenlp import Taskflow
schema [时间, 地区, 指标名]
ie Taskflow(information_extraction, schemaschema, task_path./checkpoint/model_best)
pprint(ie(我想查询2022年山东省主营业务收入数据))