How do you train and optimize the Fu Manwei artificial intelligence language model?
Training and optimization of the Fu Manwei artificial intelligence language model can be approached from the following aspects:
1. Data preparation
Collect high-quality data
- Collect text data widely from different domains and of different types, including but not limited to news articles, papers, novels, technical documentation, and social media content. Make sure the data is diverse and representative, so that the model adapts better to a variety of language contexts.
- Data can be obtained from public data sources, internal enterprise data, and partner channels. At the same time, pay attention to data copyright and make sure the data is used legally.
Data cleaning and preprocessing
- Clean the collected data to remove noise, errors, and duplicate content. This can include removing special characters, standardizing punctuation, and correcting spelling mistakes.
- Apply preprocessing steps such as word segmentation and part-of-speech tagging, so that the model can better understand and process the text. Existing natural language processing tools or libraries can be used for these steps.
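The cleaning and segmentation steps above can be sketched with the Python standard library alone. This is a minimal illustration, not a recommended pipeline: the function names `clean_text`, `deduplicate`, and `tokenize` are hypothetical, duplicates are detected only by exact match after cleaning, and the regex tokenizer assumes text in which words are separated by spaces or punctuation. Languages without such separators would need a dedicated word segmenter.

```python
import re
import unicodedata


def clean_text(text):
    """Normalize Unicode, replace control characters, and collapse whitespace."""
    # NFKC folds compatibility forms (e.g. full-width letters) into standard ones.
    text = unicodedata.normalize("NFKC", text)
    # Unicode categories starting with "C" are control/format/unassigned characters.
    text = "".join(
        " " if unicodedata.category(ch).startswith("C") else ch for ch in text
    )
    return re.sub(r"\s+", " ", text).strip()


def deduplicate(docs):
    """Drop empty documents and exact (case-insensitive) duplicates, keeping order."""
    seen = set()
    unique = []
    for doc in docs:
        cleaned = clean_text(doc)
        key = cleaned.casefold()
        if cleaned and key not in seen:
            seen.add(key)
            unique.append(cleaned)
    return unique


def tokenize(text):
    """Split text into word tokens and individual punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)


docs = ["Hello,\x00  world!\n", "hello, world!", "", "Other text"]
corpus = deduplicate(docs)
tokens = [tokenize(doc) for doc in corpus]
```

Spelling correction and part-of-speech tagging are left out here because they depend on language-specific resources.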
Data annotation and augmentation
- Annotate the data according to the specific training task. For example, for a sentiment analysis task, label the sentiment polarity of each text; for a named entity recognition task, label the entity names in the text.
- Data augmentation techniques, such as randomly replacing, deleting, or inserting words, can increase the diversity of the data and improve the model's ability to generalize.
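The three augmentation operations named above (random replacement, deletion, and insertion) might look like the following on a list of tokens. This is a sketch under assumptions: the function names and the small `vocab` list are hypothetical, and replacement words are drawn at random rather than from synonyms, so in a real task the augmented text should be checked to make sure its label still holds.

```python
import random


def random_delete(tokens, p, rng):
    """Drop each token with probability p, but keep at least one token."""
    if not tokens:
        return []
    kept = [t for t in tokens if rng.random() >= p]
    return kept if kept else [rng.choice(tokens)]


def random_replace(tokens, p, vocab, rng):
    """Replace each token with a random vocabulary word with probability p."""
    return [rng.choice(vocab) if rng.random() < p else t for t in tokens]


def random_insert(tokens, n, vocab, rng):
    """Insert n random vocabulary words at random positions."""
    out = list(tokens)
    for _ in range(n):
        # randint is inclusive, so len(out) means "append at the end".
        out.insert(rng.randint(0, len(out)), rng.choice(vocab))
    return out


rng = random.Random(0)  # fixed seed so the augmentation is reproducible
tokens = "the model reads the text".split()
vocab = ["quickly", "slowly", "carefully"]  # hypothetical replacement pool
augmented = [
    random_delete(tokens, 0.2, rng),
    random_replace(tokens, 0.2, vocab, rng),
    random_insert(tokens, 1, vocab, rng),
]
```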
2. Model architecture selection and tuning
Choose an appropriate model architecture
- The Fu Manwei artificial intelligence language model may have several architectures to choose from, such as the Transformer architecture and recurrent neural networks (RNN). Choose a suitable architecture according to the specific application requirements and the available computing resources.
- Consider the size and complexity of the model, as well as its performance on different tasks. Existing research and practical experience can serve as a reference for choosing an architecture that has been validated as effective.
Tune model hyperparameters
- Tune the model's hyperparameters, such as learning rate, batch size, hidden size, and number of layers. These hyperparameters affect the training speed and the performance of the model.
- Hyperparameter optimization methods such as grid search and random search can be used to find the best combination. At the same time, take care to avoid overfitting and underfitting.
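As an illustration of grid search and random search, the sketch below searches a small space of learning rate, batch size, and number of layers. The `validation_loss` function is a hypothetical stand-in for "train the model with this configuration and measure loss on held-out data"; its formula is invented purely so the example runs, and the candidate values are assumptions.

```python
import itertools
import random


def validation_loss(config):
    """Hypothetical stand-in for training a model and returning validation loss."""
    return (
        (config["learning_rate"] - 3e-4) ** 2 * 1e6
        + abs(config["batch_size"] - 32) / 32
        + abs(config["num_layers"] - 4) * 0.1
    )


def grid_search(space, objective):
    """Evaluate every combination in the space and return the best one."""
    names = list(space)
    best_config, best_score = None, float("inf")
    for values in itertools.product(*(space[name] for name in names)):
        config = dict(zip(names, values))
        score = objective(config)
        if score < best_score:
            best_config, best_score = config, score
    return best_config, best_score


def random_search(space, objective, trials, seed=0):
    """Evaluate a fixed number of randomly sampled combinations."""
    rng = random.Random(seed)
    best_config, best_score = None, float("inf")
    for _ in range(trials):
        config = {name: rng.choice(values) for name, values in space.items()}
        score = objective(config)
        if score < best_score:
            best_config, best_score = config, score
    return best_config, best_score


space = {
    "learning_rate": [1e-4, 3e-4, 1e-3],
    "batch_size": [16, 32, 64],
    "num_layers": [2, 4, 6],
}
grid_best, grid_score = grid_search(space, validation_loss)
rand_best, rand_score = random_search(space, validation_loss, trials=10)
```

Grid search costs one training run per combination (27 here), while random search caps the cost at the chosen number of trials. Scoring on validation data rather than training data is what lets the search guard against overfitting.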
3. Training process optimization
Choose an appropriate optimization algorithm
- Common optimization algorithms include stochastic gradient descent (SGD), Adam, and Adagrad. Different optimization algorithms perform differently in different situations, so choose one according to the characteristics of the model and the data.
- Try different optimization algorithms, compare their performance during training, and choose the one that works best.
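To make the comparison concrete, here is a minimal sketch of the update rules for the three optimizers named above, applied to a toy one-dimensional loss f(w) = (w - 3)^2. The toy loss, the learning rates, and the step counts are assumptions chosen for illustration, and the gradient here is exact rather than estimated from mini-batches, so the "SGD" update reduces to plain gradient descent.

```python
import math


def grad(w):
    """Gradient of the toy loss f(w) = (w - 3)^2."""
    return 2.0 * (w - 3.0)


def run_sgd(w, lr, steps):
    """Plain gradient step: w <- w - lr * g."""
    for _ in range(steps):
        w -= lr * grad(w)
    return w


def run_adagrad(w, lr, steps, eps=1e-8):
    """Adagrad: scale the step by the root of the accumulated squared gradients."""
    g2_sum = 0.0
    for _ in range(steps):
        g = grad(w)
        g2_sum += g * g
        w -= lr * g / (math.sqrt(g2_sum) + eps)
    return w


def run_adam(w, lr, steps, beta1=0.9, beta2=0.999, eps=1e-8):
    """Adam: bias-corrected moving averages of the gradient and its square."""
    m = 0.0
    v = 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)
        v_hat = v / (1 - beta2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w


results = {
    "sgd": run_sgd(0.0, lr=0.1, steps=200),
    "adagrad": run_adagrad(0.0, lr=0.5, steps=200),
    "adam": run_adam(0.0, lr=0.1, steps=500),
}
```

On a real model the same comparison would be made by tracking validation loss for each optimizer under an equal training budget.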
Use distributed training
- If computing resources allow, use distributed training to spread the training task across multiple compute nodes. This can greatly shorten training time and improve training efficiency.
- Distributed training frameworks can be used, such as TensorFlow's distribution strategies or PyTorch's distributed data parallel.
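The idea behind data-parallel training can be sketched without any framework: each worker computes a gradient on its own shard of the data, the gradients are averaged, and every worker applies the same update. The code below only simulates this in a single process and deliberately uses no TensorFlow or PyTorch API; the function names, the toy model y = w * x, and the assumption that there are at least as many examples as workers are all illustrative.

```python
def shard_data(data, num_workers):
    """Split the data round-robin so each worker gets a near-equal share."""
    return [data[i::num_workers] for i in range(num_workers)]


def local_gradient(w, shard):
    """Mean-squared-error gradient for the toy model y = w * x on one shard."""
    return sum(2.0 * (w * x - y) * x for x, y in shard) / len(shard)


def all_reduce_mean(grads, sizes):
    """Combine per-worker gradients into one size-weighted average."""
    total = sum(sizes)
    return sum(g * n for g, n in zip(grads, sizes)) / total


def train_data_parallel(data, num_workers, lr=0.01, steps=200):
    """Simulated synchronous data-parallel gradient descent."""
    shards = shard_data(data, num_workers)
    sizes = [len(s) for s in shards]
    w = 0.0
    for _ in range(steps):
        # On real hardware these gradients would be computed concurrently.
        grads = [local_gradient(w, s) for s in shards]
        w -= lr * all_reduce_mean(grads, sizes)
    return w


data = [(float(x), 2.0 * x) for x in range(1, 9)]  # true weight is 2.0
w_single = train_data_parallel(data, num_workers=1)
w_parallel = train_data_parallel(data, num_workers=4)
```

Because the average is weighted by shard size, the combined gradient matches the full-batch gradient, so the parallel run should follow the same trajectory as the single-worker run up to floating-point rounding.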
Monitor the training process
- During training, closely monitor the model's performance metrics, such as loss value, accuracy, and recall. Watching how these metrics change shows how training is progressing and how well the model is performing.
- Visualization tools such as TensorBoard can display the metric changes during training and the model structure in an intuitive way.
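Before wiring up a visualization tool, the monitoring described above can be prototyped with a small in-memory tracker. The `MetricTracker` class below is a hypothetical helper, not part of TensorBoard or any library: it records scalar metrics per step, keeps an exponentially smoothed value for each, and offers a simple plateau check that assumes lower values are better, as with a loss.

```python
class MetricTracker:
    """Records scalar metrics per step and keeps a smoothed value for each."""

    def __init__(self, smoothing=0.9):
        self.smoothing = smoothing
        self.history = {}   # metric name -> list of (step, value)
        self.smoothed = {}  # metric name -> latest exponential moving average

    def log(self, name, step, value):
        """Store one measurement and update the moving average."""
        self.history.setdefault(name, []).append((step, value))
        prev = self.smoothed.get(name)
        if prev is None:
            self.smoothed[name] = value
        else:
            self.smoothed[name] = self.smoothing * prev + (1 - self.smoothing) * value

    def has_plateaued(self, name, window, min_delta=1e-3):
        """True if the best value in the last `window` points beats the earlier best by less than min_delta."""
        values = [v for _, v in self.history.get(name, [])]
        if len(values) <= window:
            return False
        return min(values[:-window]) - min(values[-window:]) < min_delta


tracker = MetricTracker(smoothing=0.5)
for step, loss in enumerate([1.0, 0.5, 0.4, 0.4, 0.4, 0.4]):
    tracker.log("loss", step, loss)
stalled = tracker.has_plateaued("loss", window=3)
```

A plateau flag like this is one possible trigger for lowering the learning rate or stopping training early.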
4. Model evaluation and optimization
Evaluate the model
- After training is finished, evaluate the model to determine whether its performance meets the requirements. A test set or cross-validation can be used to measure metrics such as accuracy, recall, and F1 score on different tasks.
- At the same time, evaluate the model manually to check whether its outputs are accurate. Domain experts or users can be invited to assess the model's outputs and provide feedback.
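The metrics mentioned above can be computed directly from predictions and labels. The sketch below assumes a binary classification task with the positive class encoded as 1, and the function names are illustrative. It also includes a simple contiguous k-fold split for cross-validation; shuffling the data first is usually advisable and is left out for brevity.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall, and F1 for one class, with 0.0 where undefined."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def k_fold_indices(n, k):
    """Yield (train_indices, test_indices) for k contiguous folds over n examples."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size


y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
acc = accuracy(y_true, y_pred)
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
folds = list(k_fold_indices(10, 3))
```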
Optimize the model
- Optimize the model according to the evaluation results. If the model's performance does not meet the requirements, try adjusting the model architecture, the hyperparameters, or the training data to improve it.
- Model compression techniques, such as pruning and quantization, can reduce the model's size and amount of computation and improve its runtime efficiency.
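Pruning and quantization can be illustrated on a flat list of weights. The sketch below shows one simple variant of each: magnitude pruning, which zeroes the smallest weights, and symmetric linear quantization to 8-bit integers. Both function names are hypothetical, the weight list is assumed to be non-empty, and real frameworks apply these techniques per layer or per channel with additional calibration or fine-tuning.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    k = int(len(weights) * sparsity)
    if k == 0:
        return list(weights)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    smallest = set(order[:k])
    return [0.0 if i in smallest else w for i, w in enumerate(weights)]


def quantize_int8(weights):
    """Symmetric linear quantization to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Map quantized integers back to approximate floating-point weights."""
    return [v * scale for v in q]


weights = [0.5, -0.05, 0.2, -1.0, 0.01, 0.8]
pruned = magnitude_prune(weights, sparsity=0.5)
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
```

Each restored weight differs from the original by at most about half of `scale`, which is the accuracy given up in exchange for storing one byte per weight instead of a full float.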
Continuous improvement and updating
- Language modeling is a field that keeps developing and improving. Keep following the latest research results and technical progress, and keep improving and updating the model.
- Collect new data regularly, then retrain and optimize the model to improve its performance and adaptability.
In short, training and optimizing the Fu Manwei artificial intelligence language model requires considering data preparation, model architecture selection, training process optimization, and model evaluation together. Through repeated experimentation and improvement, the performance and quality of the model can be raised to meet the requirements of different application scenarios.