Papers related to "Artificial Intelligence"

Abstract:In recent years, face recognition has been a hot topic in society. Compared to contact-based recognition methods such as fingerprint recognition, face recognition offers the advantage of being contactless. In the field of deep learning, traditional convolutional neural networks do not achieve high enough accuracy or speed for face recognition. Therefore, this paper proposes a face recognition algorithm using the AlexNet convolutional neural network. Experimental results show that AlexNet provides higher accuracy and more stability in face recognition compared to traditional convolutional neural networks.

Abstract:With the large-scale integration of renewable energy into the distribution network, the intermittency and randomness of sources such as solar and wind power inevitably cause fluctuations in the network. Considering how renewable generation power and electricity load in the power grid vary over time, a load prediction and optimization method based on the wavelet transform and a neural network is proposed for renewable energy access to the distribution network. Firstly, grid operation data are collected and processed with the wavelet transform to obtain feature parameters from a local scale and frequency decomposition. A neural network is then established and trained on these feature parameters to obtain a model capable of predicting the load, according to which the power generation of renewable energy sources can be adjusted in time to maintain the dynamic balance between the supply and demand sides of the distribution network. The results show that the proposed method can effectively predict the load and, by observing the load in advance, regulate power generation to ensure the stability of power consumption in the distribution network while maximizing the use of renewable energy.
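
The wavelet decomposition step can be illustrated with a minimal sketch. The abstract does not name the wavelet family, so the Haar wavelet is assumed here purely for illustration; the function names `haar_step` and `haar_decompose` and the sample load values are hypothetical.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform (assumed wavelet).

    Returns (approximation, detail) coefficients. The signal length must be even.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    scale = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    return approx, detail


def haar_decompose(signal, levels):
    """Apply haar_step repeatedly to the approximation coefficients.

    Returns the coarsest approximation and the detail coefficients per level
    (finest level first). These could serve as input features for a predictor.
    """
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details


if __name__ == "__main__":
    load = [4.0, 4.0, 2.0, 0.0]  # hypothetical load samples
    coarse, details = haar_decompose(load, levels=2)
    print(coarse, details)
```

Because the transform is orthonormal, the sum of squared coefficients equals the sum of squared samples, which makes it easy to check an implementation.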

Abstract:Current artificial visual systems still struggle to handle real-world scenarios involving high-speed motion and high dynamic range scenes. Event cameras have the capability to address these challenges due to their low latency and high dynamic range for capturing fast-moving objects. However, reconstructing events into videos while maintaining their speed presents a challenge due to the highly sparse and dynamic nature of event data. Therefore, this paper proposes an event stream reconstruction algorithm based on Transformer residual networks and optical flow estimation. By jointly training optical flow estimation and event reconstruction, a self-supervised reconstruction process has been achieved. Additionally, deblurring preprocessing and subpixel upsampling modules are introduced to enhance the quality of reconstruction. Experimental results demonstrate that the proposed approach effectively improves the reconstruction quality of event streams on public datasets.

Abstract:Mutations and abnormal expressions of miRNA can potentially lead to various diseases. Hence, predicting the latent correlation between miRNA and diseases holds significant importance for the advancement of clinical medicine and drug research. Topological structure is a crucial component of miRNA-disease prediction algorithms. However, current algorithms inadequately leverage the topological structure, resulting in suboptimal predictive outcomes. At the same time, effectively integrating multi-source data is a current research trend. In response to these issues, this paper proposes an adaptive algorithm for fusing heterogeneous node structure information (MMTP). MMTP enhances miRNA-disease prediction accuracy by adaptively integrating heterogeneous node structure information: it learns structural features from first-order neighbors and metapath-induced networks, and employs metric learning and topology propagation. Results from a 5-fold cross-validation experiment demonstrate that MMTP achieves an area under the receiver operating characteristic curve (AUC) of 94.81 on the HMDD v3.2 dataset, surpassing other models. Moreover, in a case study focused on renal cancer, all of the top 30 miRNAs predicted by the model are confirmed. These results confirm the efficacy of the proposed MMTP model in predicting miRNA-disease correlations.

Abstract:The advent of the global aging era has brought critical issues concerning the elderly health care to light, and indoor falls pose a significant safety risk to seniors who live alone. Therefore, in order to accurately detect the action of falling, this paper uses millimeter-wave radar 3D point cloud data for indoor fall detection and introduces a PointLSTM network based on an external attention mechanism to classify 3D point clouds over time. The millimeter-wave radar chip of the MIMO system collects the echo signal of human movements, and the signal processing part is realized by using the microcontroller integrated with the radar baseband processor, which can convert the raw data into a three-dimensional point cloud in real time, and improve the computing speed in point cloud processing and the overall performance of radar hardware. The PointLSTM network based on external attention mechanism enables spatial and temporal feature extraction and classification of point clouds. The network addresses the loss of point information between frames in PointLSTM and links features across all data during information extraction. The external attention mechanism, with its independent learnable parameters, optimizes network complexity and recognition accuracy. Experimental results show that the proposed method achieves a detection accuracy of 98.3% in indoor environments, effectively differentiating between types of motions and confirming the feasibility of using millimeter-wave radar 3D point clouds for detecting human falls.

Abstract:A cloud-computing-based framework for predicting the spatial structure of protein folding is proposed and implemented. The original protein sequence data are obtained through the data cloud storage unit and stored in the cloud using the HDFS distributed storage mode. After the resource and queue manager (RQM) starts the cloud virtual machine, the virtual machine is used as the Sensor Node, which establishes the minimum protein molecular energy optimization function based on the two-dimensional AB off-lattice model. A quantum genetic algorithm with a local search mechanism is adopted to optimize the solution. Cloud GPU equipment is used to process the model training data and complete the automatic prediction of the spatial structure of protein folding. The experimental results show that the proposed approach achieves a lower computed value of the protein sequence energy potential function, higher execution efficiency, and a higher GDT-TS (Global Distance Test Total Score) evaluation index value.

Abstract:Trees play a vital role in maintaining ecological balance, protecting biodiversity, regulating climate and improving air quality. In order to solve the problem of low tree identification accuracy in complex backgrounds, a tree species identification model MFFMN-KD-TA for common arbor in subtropics is proposed based on tree multi-feature fusion and knowledge distillation. The model uses three parallel MobileNetV3_Small backbone networks to extract features of leaves, trunks and overall trees respectively, and optimizes training by using knowledge distillation and embedding Triplet Attention modules. The test results show that the accuracy, precision and F1 score of the MFFMN-KD-TA model on the self-built tree test set are 0.960 9, 0.962 1 and 0.960 8 respectively, which are 3.05%, 2.83% and 3.07% higher than the MFFMN model respectively. Compared with the three-branch fusion models 3-ShuffleNetV2 and 3-MobileNetV2, the multi-feature fusion model MFFMN-KD-TA proposed in this study has a smaller number of parameters and can identify arbor species more accurately, providing a new idea and method for tree species identification in subtropics and other areas.

Abstract:Using active learning models to select the most valuable data points for annotation is one way that deep learning reduces the amount of labeled data required. Prediction loss models are a type of task-agnostic active learning models that perform well across multiple tasks. However, these models are not end-to-end models, and changing input features can lead to input bias during the training of the loss prediction network. This paper proposes a temporal feature fusion prediction loss model to address the issue of input bias in such models. Experiments demonstrate that the method proposed in this paper achieves an average performance improvement of approximately 1.5% across various tasks compared to previous state-of-the-art methods, and an average improvement of 5% compared to the original prediction loss model.

Abstract:Aiming at the poor detection accuracy of current heart sound recognition algorithms, a new heart sound recognition method based on a hybrid convolutional neural network-support vector machine (CNN-SVM) model is proposed. To verify the effectiveness of the method, two databases of normal and abnormal heart sound signals are compiled from the PASCAL challenge experiment data. After preprocessing, MFCC feature extraction and PCA dimension reduction, the data are input into the CNN-SVM model for training. The performance of the proposed method is evaluated in terms of accuracy, recall, specificity and F score. The hybrid CNN-SVM model is compared with the single SVM model and the single CNN model respectively. Five groups of experimental results show that the proposed method can distinguish the two kinds of heart sounds with a high average recognition rate of 99%, which is 2.48% higher than the single CNN method. It is also higher than the single SVM algorithm.
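
The four evaluation measures named above follow directly from the confusion matrix of a two-class problem. The sketch below assumes that "abnormal" is the positive class and that the F score is the balanced F1 score; the abstract states neither, and the function name `binary_metrics` is hypothetical.

```python
def binary_metrics(tp, fp, tn, fn):
    """Accuracy, recall, specificity and F1 from confusion-matrix counts.

    tp/fn count abnormal (positive) recordings, tn/fp count normal ones.
    Ratios with an empty denominator are reported as 0.0.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0        # sensitivity
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {
        "accuracy": accuracy,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


if __name__ == "__main__":
    # Hypothetical counts, not results from the paper.
    print(binary_metrics(tp=40, fp=10, tn=45, fn=5))
```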

Abstract:Predicting protein-protein interaction (PPI) in plants holds significant biological implications. This study has employed four encoding methods and a deep neural network to construct a model for predicting protein interactions. The results show that the developed PPI prediction model using the integrated approach of the protein language model Ankh with a deep neural network has achieved optimal AUPR and AUC values across three plant datasets, with its Sen and MCC values also outperforming those of four other models designed for protein interaction predictions. When tested on plant PPI datasets for rice and soybean, the proposed model has yielded AUPR scores of 0.802 5 and 0.730 1 respectively, and AUC scores of 0.956 2 and 0.950 7 respectively. These outstanding results indicate that the PPI model incorporating the protein language model Ankh can serve as a promising tool for predicting protein-protein interactions in plants.

Abstract:Due to the low discrimination between objects and background texture in active millimeter wave images and the need for real-time security screening, an active millimeter wave image object detection method based on a global channel attention booster is proposed. To strengthen attention to the global channel information of the concealed object and improve detection performance when the concealed object cannot be distinguished from the background texture, this method uses YOLOv5s as the base model and adds global channel attention along the position directions of coordinate attention. The K-means++ clustering method is used to create the anchor boxes for identifying concealed objects in millimeter wave images. The results demonstrate that, on both the array image dataset and the line sweep image dataset, the detection model strengthens attention to the features of hidden objects and improves detection performance while meeting the real-time requirements of security screening.

Abstract:In this paper, a Bézier curve-based trajectory planning method with obstacle avoidance for intelligent vehicles is studied, covering both path planning and speed planning. In path planning, in order to adapt to roads of various shapes, the Cartesian coordinates of roads are converted to Frenet coordinates. Taking path length, curvature, continuity and vehicle collision risk as the cost function, the dangerous potential field theory is introduced to describe the risk of vehicle collision, and the sequential quadratic programming method is adopted to solve this nonlinear optimization problem. In speed planning, driving efficiency and comfort are taken as the objectives, and different driving needs can be met by adjusting the weight of each sub-objective function.
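
As a small illustration of the curve representation only (not of the cost function or the sequential quadratic programming solver), a Bézier curve can be evaluated with de Casteljau's algorithm. The control points below are hypothetical; in a planner of this kind they would be the optimization variables, and the function names are assumptions.

```python
def bezier_point(control_points, t):
    """Evaluate a Bezier curve of any degree at t in [0, 1] (de Casteljau)."""
    points = [tuple(p) for p in control_points]
    # Repeatedly interpolate between neighbouring points until one remains.
    while len(points) > 1:
        points = [
            tuple((1 - t) * a + t * b for a, b in zip(p, q))
            for p, q in zip(points, points[1:])
        ]
    return points[0]


def sample_bezier(control_points, n=20):
    """Return n evenly spaced (in parameter t) points along the curve."""
    return [bezier_point(control_points, i / (n - 1)) for i in range(n)]


if __name__ == "__main__":
    # Hypothetical cubic control points, e.g. (s, l) pairs in a Frenet frame.
    ctrl = [(0.0, 0.0), (1.0, 2.0), (3.0, 2.0), (4.0, 0.0)]
    for point in sample_bezier(ctrl, n=5):
        print(point)
```

The curve always starts at the first control point and ends at the last one, which is why Bézier curves are convenient for fixing the start and goal states of a path.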

Abstract:Chinese named entity recognition involves two tasks: Chinese flat named entity recognition and Chinese nested named entity recognition, of which the nested task is more difficult. Therefore, this paper proposes a unified model, namely TLEXNER, based on lexicon enhancement and table filling, which can tackle the two tasks concurrently. To address the difficulty of Chinese word segmentation, a lexicon adapter is used to integrate lexicon information into the BERT pre-trained model, and the relative position information of characters and lexical groups is integrated into the BERT embedding layer. Then conditional layer normalization and a biaffine model are used to build and predict the representation of the character-pair table, and the relationship between character pairs is modeled by the table structure to obtain a unified representation of flat entities and nested entities.

Abstract:Once a task-based question answering system is built, it is usually fixed and can answer only a very limited set of questions, making it difficult to meet user needs. A method for automatically updating the knowledge base in real time is proposed. When a user asks a question that the question answering system cannot answer, the system automatically forwards the question to a human customer service agent. After the agent replies using professional knowledge, the system automatically obtains the user's question and the agent's answer in real time and adds the question-answer pair to the knowledge base. If other users ask similar questions, the question answering system can quickly provide the corresponding answers. Taking a question answering system in the field of government affairs as an example, the text vectorization method ERNIE is applied to build a question answering system that automatically updates the knowledge base in real time. Experiments show that the proposed method achieves automatic real-time updates of the knowledge base, and that the constructed question answering system has autonomous learning and memory functions, improving the intelligence level of the task-based question answering system.

Abstract:Garbage classification is an important part of building ecological civilization. To solve the problem that heavyweight models are difficult to deploy to mobile devices, an improved garbage image classification method based on the YOLOv5 network is proposed. A backbone network fused with GhostNet replaces conventional convolution operations with linear operations, which reduces the number of model parameters and improves the model inference speed. By adding an improved channel attention module to the network, the important channel features are strengthened and more deep-level feature information is obtained. The weighted boundary fusion method is used to improve the localization accuracy of the detection boxes. Experiments on a self-built dataset demonstrate that, compared with the original model, the method improves accuracy by 8.5%, reduces the number of parameters by 46.7%, and shortens the average inference time by 1.22 ms, achieving a comprehensive improvement in accuracy and inference speed.

Abstract:To address the issues of insufficient semantic representation and incomplete feature extraction in Chinese event extraction, a method based on RoBERTa and multi-level features is proposed. Firstly, by using the pre-trained RoBERTa model, word embeddings are constructed and extended based on syntactic and semantic information of trigger words. Specifically, part-of-speech tags and trigger word embeddings are integrated into the word embeddings. Secondly, global and local features are extracted using a bi-directional long short-term memory network and convolutional neural network, respectively. The self-attention mechanism is employed to capture the relationships among different features, emphasizing the utilization of important features. Finally, a conditional random field is used to achieve BIO sequence labeling, completing the event extraction process. On the DuEE1.0 dataset, the F1 scores of trigger word extraction and event argument extraction reach 86.9% and 68.0%, respectively, which are superior to existing common event extraction models, validating the effectiveness of this method.
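
The final BIO labeling step can be made concrete with a short decoder that turns a tag sequence into labeled spans. This is only a sketch of the decoding convention, not of the RoBERTa, BiLSTM, CNN or CRF components; the tag name `Trigger`, the example sentence and the function name `decode_bio` are hypothetical.

```python
def decode_bio(tokens, tags):
    """Convert per-token BIO tags into a list of (label, text) spans.

    Tokens are joined without spaces, which suits character-level Chinese
    input. An I- tag that does not continue the current span is ignored.
    """
    spans = []
    current_label, current_tokens = None, []

    def flush():
        if current_tokens:
            spans.append((current_label, "".join(current_tokens)))

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            flush()                                   # close the previous span
            current_label, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_label == tag[2:]:
            current_tokens.append(token)              # extend the open span
        else:
            flush()                                   # "O" or inconsistent tag
            current_label, current_tokens = None, []
    flush()
    return spans


if __name__ == "__main__":
    tokens = list("昨天地震了")
    tags = ["O", "O", "B-Trigger", "I-Trigger", "O"]
    print(decode_bio(tokens, tags))
```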

Abstract:Simple text classification models have low accuracy, while pre-trained models have complex structures and are difficult to use directly in practical environments. To address these problems, this paper proposes a text classification method based on multi-teacher knowledge distillation. The method uses the "teacher-student network" training scheme, with the BERT-wwm-ext and XLNet pre-trained models as the teacher models. The output probability matrices of the two teachers are fused into soft labels using a weight coefficient. The student model is a BiGRU-CNN network. The mean square error function is used to calculate the soft label error, and the cross-entropy loss function is used to calculate the hard label error. The student model is trained with both hard labels and soft labels to minimize the value of the loss function. The test results show that the accuracy of the proposed method is greatly improved compared with the student model alone and is close to that of the pre-trained models, which saves running time and improves efficiency while preserving classification accuracy.
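
The loss described above can be sketched for a single sample as follows. The abstract does not give the fusion weight or say how the soft-label and hard-label errors are combined, so `weight_a` and `alpha` below are assumed hyperparameters and the weighted sum is only one plausible choice; the function names are hypothetical.

```python
import math


def fuse_soft_labels(probs_teacher_a, probs_teacher_b, weight_a=0.5):
    """Weighted average of two teachers' class-probability vectors."""
    return [
        weight_a * a + (1.0 - weight_a) * b
        for a, b in zip(probs_teacher_a, probs_teacher_b)
    ]


def distillation_loss(student_probs, soft_label, hard_label, alpha=0.5):
    """Assumed combination: alpha * MSE(soft) + (1 - alpha) * CE(hard).

    student_probs: the student's predicted class probabilities.
    soft_label:    fused teacher probabilities.
    hard_label:    index of the ground-truth class.
    """
    mse = sum((s - t) ** 2 for s, t in zip(student_probs, soft_label)) / len(student_probs)
    cross_entropy = -math.log(max(student_probs[hard_label], 1e-12))
    return alpha * mse + (1.0 - alpha) * cross_entropy


if __name__ == "__main__":
    # Hypothetical probabilities for a two-class problem.
    soft = fuse_soft_labels([0.8, 0.2], [0.6, 0.4], weight_a=0.5)
    print(soft, distillation_loss([0.7, 0.3], soft, hard_label=0))
```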

Abstract:The background of power transmission line inspection images is complex, and target detection is easily disturbed. Based on the YOLOX neural network model, this paper proposes a method for detecting mountain fires near power transmission lines. Firstly, the backbone feature extraction network framework of YOLOX is adopted, and the conventional convolution of the multi-scale feature extraction module is replaced by deformable convolution. Secondly, a fusion of channel attention and spatial attention modules is added in the enhanced feature extraction stage, which can adapt to the variable shape of flames, extract mountain fire features more effectively, and thus improve the accuracy of target detection. The experiment verifies the effectiveness of the proposed method.

Abstract:Human key point detection has important applications in intelligent video surveillance, human-computer interaction and other fields. Heatmap-based human key point detection algorithms depend on high-resolution heatmaps and consume large computational resources. To address this problem, a lightweight algorithm combined with uncertainty estimation is proposed. The reliability of prediction results is improved by using low-resolution heatmaps and using uncertainty to estimate the scale parameters of the prediction error distribution. The scale parameter is used to supervise and constrain the heatmap, which alleviates vanishing gradients and enhances the robustness of the network. Experiments on the COCO dataset show that, compared with integral pose regression, the average accuracy of the improved algorithm is improved by 3.3% and the resource usage is reduced.

Abstract:In recent years, with the development of deep learning technology, encoder-decoder image segmentation methods have become increasingly widespread in the automatic analysis of pathological images. However, due to the complexity and variability of gastric cancer lesions, large scale changes, and the blurring of boundaries caused by digital staining images, segmentation algorithms designed for a single scale often cannot obtain accurate lesion boundaries. To improve the accuracy of gastric cancer lesion image segmentation, this paper proposes a gastric cancer image segmentation algorithm based on a multi-scale attention fusion network with an encoder-decoder structure. The encoder uses EfficientNet as the feature extractor. In the decoder, deep supervision of the network is realized by extracting and fusing multi-path features from different levels. At the output, spatial and channel attention are used to screen the multi-scale feature maps. At the same time, an integrated loss function is used in the training process to optimize the model. The experimental results show that the Dice coefficient score of this method on the SEED data set is 0.806 9, which to some extent achieves more refined gastric cancer lesion segmentation compared to FCN and UNet series networks.
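
For reference, the Dice coefficient used as the evaluation score measures the overlap between a predicted mask and the ground-truth mask: twice the size of the intersection divided by the sum of the two mask sizes. A minimal sketch for binary masks given as nested lists (the function name and the convention of returning 1.0 for two empty masks are assumptions):

```python
def dice_coefficient(pred_mask, true_mask):
    """Dice = 2 * |P ∩ T| / (|P| + |T|) for binary masks (values 0 or 1)."""
    pred = [v for row in pred_mask for v in row]
    true = [v for row in true_mask for v in row]
    intersection = sum(p * t for p, t in zip(pred, true))
    total = sum(pred) + sum(true)
    # Two empty masks agree perfectly; this convention is an assumption.
    return 1.0 if total == 0 else 2.0 * intersection / total


if __name__ == "__main__":
    predicted = [[1, 1], [0, 0]]  # hypothetical 2x2 masks
    reference = [[1, 0], [0, 0]]
    print(dice_coefficient(predicted, reference))
```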

Abstract: Neural machine translation for the legal domain is of great value for application scenarios such as contract text translation. Due to the scarcity of bilingual corpora in the legal domain, machine translation performance is still not satisfactory. A practical way to address this problem is to integrate prior knowledge such as translation memory (TM) or templates. Moreover, texts in the legal domain mostly have fixed expression structures and precise wording specifications, so translation performance in the legal field can be further improved by using both the sentence structure information and the semantic information in the translation memory. Based on this, this paper proposes a new framework that uses monolingual TM and performs learnable memory retrieval in a cross-language manner. Firstly, the monolingual translation memories contain both translation memory and translation templates, which provide richer external knowledge to the model. Secondly, the retrieval model and the translation model can be jointly optimized. Experiments on the MHLAW dataset show that this model surpasses baseline models by up to 1.28 BLEU points.

Abstract: With the development of power user information collection systems, richer user electricity consumption information is available for identifying anomalies in that information. In this paper, a false data injection (FDI) attack is simulated to construct a dataset of user electricity consumption information anomalies, and an improved stacking ensemble classification algorithm based on recall is proposed. The K-nearest neighbors algorithm (KNN), random forest (RF), support vector machine (SVM) and gradient boosting decision tree (GBDT) are used as the base classification models of the stacking structure. Logistic regression (LR) is used as the meta-classification model of the stacking structure. The outputs of the base classification models are weighted based on their recall rates and used as the input data set of the meta-classification model. The proposed recall-based improved stacking classification algorithm is shown to be more efficient than the traditional stacking classification algorithm.
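
The recall-based weighting of base-model outputs can be sketched as below. The abstract does not give the exact weighting formula, so normalizing the recalls to sum to one and multiplying each base model's predicted anomaly probability by its weight is an assumption, as are the function names, the 0.5 threshold and the sample values. Training the base models and the logistic regression meta-model is left out.

```python
def recall_score(y_true, y_pred):
    """Recall of the positive (anomaly = 1) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else 0.0


def recall_weights(val_probs, y_val, threshold=0.5):
    """Per-model weights proportional to recall on a validation set.

    val_probs maps a base-model name to its predicted anomaly probabilities.
    """
    recalls = {
        name: recall_score(y_val, [int(p >= threshold) for p in probs])
        for name, probs in val_probs.items()
    }
    total = sum(recalls.values())
    if total == 0:
        # Fall back to equal weights when no model finds any anomaly.
        return {name: 1.0 / len(recalls) for name in recalls}
    return {name: r / total for name, r in recalls.items()}


def weighted_meta_features(probs, weights):
    """Build meta-classifier inputs: one weighted column per base model."""
    names = sorted(probs)
    n_samples = len(probs[names[0]])
    return [[weights[n] * probs[n][i] for n in names] for i in range(n_samples)]


if __name__ == "__main__":
    y_val = [1, 1, 0, 1]  # hypothetical labels and base-model outputs
    val_probs = {"knn": [0.9, 0.2, 0.1, 0.8], "rf": [0.9, 0.7, 0.4, 0.6]}
    w = recall_weights(val_probs, y_val)
    print(w, weighted_meta_features(val_probs, w))
```

In practice the weights would be estimated on held-out or out-of-fold predictions and then applied to the base-model outputs that feed the meta-classifier.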

Abstract: In order to minimize the accuracy loss when compressing huge deep learning models and deploying them to devices with limited computing power and storage capacity, a knowledge distillation model compression method is investigated and an improved multi-teacher model knowledge distillation compression algorithm with filtering is proposed. Taking advantage of the integration of multi-teacher models, the better-performing teacher models are screened for student instruction using the predicted cross-entropy of each teacher model as the quantitative criterion for screening, and the student models are allowed to extract information starting from the feature layer of the teacher models, while the better-performing teacher models are allowed to have more say in the instruction. The experimental results of classification models such as VGG13 on the CIFAR100 dataset show that the multi-teacher model compression method in this paper has better performance in terms of accuracy compared with other compression algorithms with the same size of the final obtained student models.

Abstract:The development of LiDAR technology provides abundant 3D data for autonomous driving. However, a LiDAR point cloud is actually incomplete 2.5D data because of signal loss caused by occlusion and some reflective materials, which poses a fundamental challenge to 3D perception. To solve this problem, this paper proposes a method for 3D completion of the original data. Exploiting the symmetric shape and high repetition rate of most objects, the complete shape of the occluded part of the point cloud is estimated by learning prior object shapes. The method first identifies regions affected by occlusion and signal loss and, in these regions, predicts the occupancy probability of the shapes of the objects they contain. For occlusion between objects, 3D completion is performed using the shape occupancy probability and other objects that share the same shape. Self-occluded objects are restored by mirroring. Finally, the completed data are fed to a point cloud object detection network for learning. The results show that this method can effectively improve the mAP of the 3D bounding boxes generated from point clouds.

Abstract:Aiming at the high miss rate and false positive rate in single-view mammography, an improved automatic detection algorithm is proposed in this paper. A dilated residual network (DRN) combined with a modified feature pyramid network (FPN) is used to detect breast masses. The dilated convolution in the DRN reduces the number of times images are downsampled. The number of DRN layers is also increased to satisfy the input required by the FPN. In the FPN structure, an attention mechanism is used to reduce the information loss caused by directly fusing different feature maps, while dense connections replace the original lateral connections to make full use of the location and detail information on the target in the shallow features. Simulation experiments show that the detection accuracy of the designed model on the CBIS-DDSM dataset is 7.1 percent higher than the baseline.

Abstract: In this paper, a Hammerstein-Wiener model based on the extreme learning machine is built to identify the Continuous Stirred Tank Reactor (CSTR) nonlinear system, which is widely used in chemical processes. In the proposed Hammerstein-Wiener model, the two nonlinear blocks are described by two different extreme learning machine neural networks, and the linear block is described by an ARX model. Due to the special structure of the extreme learning machine, the model can be expressed in the form of a linear regression. The model parameters are identified by the generalized least squares algorithm. The identification process is simple and has low computational complexity. The simulation results show that the proposed approach is effective. Compared with the polynomial-based Hammerstein model and the ARX-LSSVM Hammerstein model, the proposed method has higher identification accuracy.

Abstract:In order to solve the problem of indiscriminate parking of shared bikes in cities, a standardized parking system for shared bikes based on OpenMV is proposed. The system is mainly composed of an OpenMV development board, a BC20 communication and positioning module and the OneNET cloud server. OpenMV detects and recognizes the parking state, and the built-in network module of the BC20 exchanges data with the OneNET cloud server and feeds the detection results back to the mini-program client, so as to realize parking detection and control of shared bikes. Extensive tests show that the system runs stably with a recognition accuracy of 94.1%, can automatically detect whether shared bikes are parked in a standardized way, and has good market application prospects and value.

Abstract:In order to reduce the difficulty and computational cost of feature extraction for web spam detection, a method for extracting semantic features based only on the HTML script of the current page is proposed. Firstly, the domain name is segmented by a memoization search algorithm combining depth-first search and dynamic programming. Secondly, latent Dirichlet allocation is used to extract subject words of the web page. Lastly, three single-page semantic similarity features are calculated based on Word2Vec and word mover's distance. Combining the single-page semantic similarity features with single-page statistical features, classification algorithms such as random forest are used to build classification models for web spam detection. The experimental results show that the AUC value of single-page content extraction based on semantic and statistical features for classification reaches 88.0%, which is about 4% higher than that of the control method.
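
The domain name segmentation step, a depth-first search whose sub-results are cached, can be sketched as follows. The abstract does not state which segmentation is preferred when several exist, so choosing the one with the fewest words is an assumption; the function name `segment_domain` and the tiny vocabulary are hypothetical.

```python
from functools import lru_cache


def segment_domain(name, vocabulary):
    """Split a domain label into dictionary words, or return None.

    Depth-first search over split positions with memoization: best(start)
    is computed once per position, which gives the dynamic-programming
    behaviour. Among valid segmentations the one with the fewest words wins.
    """
    vocab = frozenset(vocabulary)
    max_len = max(map(len, vocab), default=0)

    @lru_cache(maxsize=None)
    def best(start):
        if start == len(name):
            return ()                                   # nothing left to split
        candidate = None
        last_end = min(len(name), start + max_len)
        for end in range(start + 1, last_end + 1):
            word = name[start:end]
            if word not in vocab:
                continue
            rest = best(end)                            # cached sub-result
            if rest is None:
                continue
            if candidate is None or len(rest) + 1 < len(candidate):
                candidate = (word,) + rest
        return candidate

    result = best(0)
    return list(result) if result is not None else None


if __name__ == "__main__":
    words = {"cheap", "loans", "loan", "che", "a", "s"}  # hypothetical lexicon
    print(segment_domain("cheaploans", words))
```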

Abstract:Most traditional deep learning point cloud completion methods use only global features and ignore local features. To better extract and use the local features of point clouds, an end-to-end point cloud completion network based on deep learning is proposed in this paper. Building on the point cloud completion network (PCN), the encoder introduces a dynamic graph convolution (DGCNN) improved for local features. Edge convolutions of multiple different dimensions are used to extract richer local features, and the features of distant points are weakened according to distance. The network structure is then optimized with residual connections in the style of deep residual networks to fuse multi-scale features, and mean pooling is added to compensate for the information loss caused by global pooling. In the decoder, FoldingNet is used to produce the completed output point cloud. The experimental results show that the proposed network improves on PCN and other point cloud completion networks in some respects, which verifies the effectiveness of the new method.

Abstract: Feature extraction and analysis of medical text is of great practical value in building clinical decision support systems. To address the difficulty of extracting features from raw medical texts that contain various terms and abbreviations, a medical text analysis model based on BERT and Word2vec is proposed. The model extracts key medical entities from medical records and establishes a knowledge-based weight scoring mechanism for semantic analysis of medical texts. The experimental data show that the model has certain advantages in medical text feature extraction, performs well in the analysis and diagnosis of hypertensive intracerebral hemorrhage medical records, and can be effectively used in clinical decision support systems.

Abstract:The Scene Graph Generation (SGG) task aims to detect visual relation triples in images, i.e. subject, predicate and object, to provide a structural visual layout for scene understanding. However, existing approaches to scene graph generation ignore the high frequency but uninformative problem of predicted predicates, hindering progress in this field. In order to solve the above problems, this paper proposes a scene graph generation algorithm based on enhanced semantic information understanding. The whole model consists of four parts: feature extraction module, image cropping module, semantic transformation module and extended information predicate module. Feature extraction module and image cropping module are responsible for extracting visual features and making them global and diverse. The semantic transformation module is responsible for restoring the semantic relationship between predicates from common predictions to informative predictions. The extended information predicate module is responsible for extending the sampling space of the information predicate. Comparing with other methods on datasets VG and VG-MSDN, the average recall reaches 59.5% and 40.9%, respectively. The algorithm in this paper can improve the problem of insufficient information of the predicted predicate, and then improve the performance of the scene graph generation algorithm.

Abstract: To solve the problem of perception without decision-making in the agricultural modernization of the Xinjiang Corps, an image classification method (TL-DA-SE-CNN) is proposed, based on transfer learning with a hybrid model that combines an attention mechanism module (SENet) and a convolutional neural network. The method selects four different CNN models for weight acquisition: VGGNet, ResNet, InceptionNet and MobileNet. The model uses a SENet classifier instead of the fully connected layer of the convolutional neural network, extracts structural high-order statistical features of the image for topic classification, and uses the BP algorithm to adjust the parameters, reaching a classification accuracy of 98.20%. Experimental results show that combining CNN with transfer learning, data augmentation and SENet improves the performance of livestock image classification, which is an effective application of convolutional neural networks in farm automation clustering.

Abstract:To reduce taxi operating losses, the losses must first be quantified. The average time cost of drivers' waiting is quantified by queuing theory, and the "Technique for Order Preference by Similarity to an Ideal Solution (TOPSIS)" model, improved by the entropy method, quantifies the possibility of carrying passengers in different periods. This possibility is then used to quantify the potential loss parameters of no-load operation. From these quantitative indicators, a taxi operation loss model is constructed, and an operational decision-making scheme to reduce losses is proposed on the basis of this model. Finally, verification with flight and taxi data from Luoyang Beijiao airport shows that the decision-making method of this paper can effectively reduce the operational loss of taxis.
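
The entropy-weighted TOPSIS step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the criteria or the data, so the two criteria used here (arriving flights as a benefit criterion, queued taxis as a cost criterion) and all numbers are hypothetical; the code shows the textbook entropy method and TOPSIS, not necessarily the authors' improved variant.

```python
import math

def entropy_weights(matrix):
    """Entropy-method weights; rows are alternatives, columns are criteria.
    Assumes positive entries and at least one non-constant column."""
    n, m = len(matrix), len(matrix[0])
    raw = []
    for j in range(m):
        col = [row[j] for row in matrix]
        total = sum(col)
        probs = [v / total for v in col]
        # Shannon entropy scaled to [0, 1]; terms with p == 0 contribute 0.
        entropy = -sum(p * math.log(p) for p in probs if p > 0) / math.log(n)
        raw.append(1.0 - entropy)
    s = sum(raw)
    return [w / s for w in raw]

def topsis(matrix, weights, benefit):
    """Closeness of each alternative to the ideal solution (1.0 is best).
    benefit[j] is True if larger values of criterion j are better.
    Assumes the alternatives are not all identical."""
    m = len(matrix[0])
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(m)]
    v = [[weights[j] * row[j] / norms[j] for j in range(m)] for row in matrix]
    best = [max(r[j] for r in v) if benefit[j] else min(r[j] for r in v)
            for j in range(m)]
    worst = [min(r[j] for r in v) if benefit[j] else max(r[j] for r in v)
             for j in range(m)]
    scores = []
    for r in v:
        d_best, d_worst = math.dist(r, best), math.dist(r, worst)
        scores.append(d_worst / (d_best + d_worst))
    return scores

# Hypothetical periods: (arriving flights, taxis already queued).
periods = [[30.0, 20.0], [10.0, 60.0], [20.0, 80.0]]
w = entropy_weights(periods)
print(topsis(periods, w, benefit=[True, False]))
```

In this sketch a higher closeness score stands in for a higher possibility of carrying passengers in that period.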

Abstract:As the resolution of input images in object detection tasks increases, the feature information extracted by the feature extraction network becomes more and more limited if the receptive field of the network remains unchanged, and the information overlap between adjacent feature points becomes higher and higher. This paper proposes an FSA (fusion self-attention)-FPN and designs an SAU (self-attention upsample) module. The internal structure of the SAU performs cross calculation between the self-attention mechanism and the CNN for further feature fusion, and the FCU (feature coupling unit) is reconstructed to eliminate feature misalignment between them and bridge the semantic gap. A comparative experiment is carried out on the Pascal VOC2007 dataset using YOLOX-Darknet53 as the backbone network. The experimental results show that, compared with the FPN of the original network, mAP@[.5:.95] improves by 1.5% after replacing it with FSA-FPN, and the predicted box positions are also more accurate. The method has good application value in detection scenarios that require higher accuracy.

Abstract:In view of the complex structure and high maintenance cost of transformers, this paper proposes a transformer fault signal recognition algorithm based on deep learning. Firstly, the voiceprint signal of the transformer under working conditions is analyzed and converted into a two-dimensional image signal. Building on the strengths of the VGG16 neural network in image processing, an MCA attention mechanism is proposed that retains both background information and detail information. Secondly, the max-pooling downsampling in VGG16 is optimized: a soft pooling sampling method is adopted to reduce the feature loss caused by max-pooling downsampling in the image. Finally, to avoid overfitting, the activation function in the top structure of VGG16 is optimized by introducing the self-normalizing SELU activation function. The experiments show that the generalized S-transform is the best choice for converting the one-dimensional time-domain signal to a two-dimensional image signal, and the average recognition rate of the proposed algorithm for six types of fault signals reaches 99.15%.

Abstract:Deep neural networks have achieved great success in the classification and prediction of high-dimensional data. Training deep neural networks is a data-intensive task that requires collecting large-scale data from multiple data sources. These data usually contain sensitive information, which makes the training process of convolutional neural networks prone to leaking private data. To address the problems of data privacy and communication cost in the training process, this paper proposes a distributed training method that allows deep neural networks to be learned jointly from multiple data sources. Firstly, a distributed training architecture is proposed, composed of one computing center and multiple agents. Secondly, a distributed training algorithm based on multiple data sources is proposed, which trains models jointly in a distributed manner by splitting the convolutional neural network, under the constraints that raw data are not shared directly and the communication cost is reduced. Thirdly, the correctness of the algorithm is analyzed. Finally, the experimental results show that the method is effective.

Abstract:To improve safety during ordinary railway catenary maintenance operations and the anti-interference capability of information transmission, an ordinary railway catenary condition monitoring system based on artificial intelligence robot technology and LoRa communication technology is designed. The system consists mainly of a robot that serves as the data acquisition device at the job site, where it can perform electricity inspection and hook the ground wire and return line, together with a part outside the job site that transmits and processes status information and provides visualized monitoring at the terminal. In monitoring and simulation tests, the anti-interference results of the robot's data collection are in line with expectations. Meanwhile, data transmission between the LoRa gateway and the cloud server is normal, and visualized monitoring of the status information from the PC side is realized.

Abstract:To address the low accuracy of sentiment classification for online review text, an improved model based on BERT and the bidirectional gated recurrent unit (BiGRU) is proposed. Word vectors are produced by the BERT model, which can represent the rich semantic features of the text. The classification performance is improved by combining it with the BiGRU neural network, which can retain contextual information of the text over long spans. On this basis, an attention mechanism is introduced to raise the weight of the emotional words that best express the classification result, improving the accuracy of sentiment classification. The model was tested on the Acllmdb_v1 dataset and a hotel reviews dataset, both of which are public. The experimental results show that the model achieves good performance in both Chinese and English text sentiment classification tasks.

Abstract: Road traffic sign detection is an important part of intelligent transportation. To address the problems of complex backgrounds, small targets and slow detection speed in traffic sign detection, a detection method based on an improved version of the YOLOv3 model used in industry is proposed. The method uses a bidirectional feature pyramid structure to achieve bidirectional fusion of the semantic information of low-, middle- and high-level image features, improving the classification of low-level prediction targets and the localization of high-level prediction targets. The main feature extraction network of the original model is improved, and the Darknet23 network is proposed to improve the extraction ability of the network and reduce the computational burden. According to the characteristics of the target shapes, the K-means clustering algorithm is used to obtain suitable anchor boxes, and a more flexible L_(α-CIOU) loss function is introduced into the bounding box regression so that the network optimizes towards a higher degree of overlap between the prediction boxes and the ground-truth boxes. The experimental results show that the method reaches 86.10% mAP@0.75 and 70.017% mAP@0.5:0.05:0.95 on the CCTSDB dataset, which are 10.17% and 5.656% higher than the original network; the number of parameters is reduced by 3 622 091 and the speed is improved by 8.27 f/s, which is better than mainstream detection networks such as SSD and Faster RCNN.

Abstract:In recent years, with the arrival of the big data era, large volumes of text and other data that are hard to recognize and semantically intricate have appeared in people's lives, which makes classification more difficult. How to make computers classify this information accurately has become an important research task. Chinese news text classification is one branch of this field, and it plays a crucial role in the control of national public opinion, the understanding of users' daily behavior, and the prediction of users' future speech and behavior. In view of the shortcomings of news text classification models, namely large numbers of parameters and long training times, knowledge distillation based on BERT-CNN is proposed to compress the training time while maximizing model performance, striving for a compromise between the two. In line with the technical characteristics of model compression, BERT is used as the teacher model and CNN as the student model; BERT is pre-trained first, and the student model then learns to generalize the capability of the teacher model. The experimental results show a parameter compression ratio of about 1/82 and a time reduction ratio of about 1/670, with a model performance loss of about 2.09%.
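
The teacher-student idea can be sketched with the commonly used soft-target distillation loss. The abstract does not state which loss the authors use, so the temperature-scaled formulation below, the parameter names `temperature` and `alpha`, and the toy logits are all assumptions for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """alpha * T^2 * KL(teacher || student) + (1 - alpha) * cross-entropy.
    The KL term uses temperature-softened distributions; the cross-entropy
    term uses the true class index `label` at temperature 1."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student) if pt > 0)
    hard = -math.log(softmax(student_logits)[label])
    return alpha * temperature ** 2 * kl + (1 - alpha) * hard

# Toy logits for a 3-class news example (hypothetical values).
teacher = [2.0, 0.0, -1.0]
print(distillation_loss([1.5, 0.2, -0.8], teacher, label=0))
```

The soft-target term is what lets a small student such as a CNN absorb the teacher's class similarities rather than only the hard labels.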

Abstract:Lip reading is crucial in silent environments, in environments with serious noise interference, and for people with hearing impairment. For the word-level Chinese lip reading problem, the SinoLipReadingNet model is proposed. Its front end uses Conv3D and ResNet34 to extract spatio-temporal features, and its back end uses Conv1D and Bi-LSTM for classification and prediction respectively. Self-attention and CTCLoss are also added to improve the Bi-LSTM back end. Finally, the SinoLipReadingNet model is tested on the XWBank lipreading dataset. The results show that its prediction accuracy is significantly better than that of the D3D model, and that the prediction accuracy and average CER of multi-model fusion reach 77.64% and 21.68% respectively.

Abstract:To solve the problem that common object detection algorithms are difficult to apply effectively in classroom scenarios, a student behavior detection algorithm combining a lightweight design with a trapezoidal structure is proposed. The algorithm is based on the YOLOv4 architecture. According to the characteristics of target classification and spatial distribution, a new "trapezoidal" feature fusion structure is proposed and, combined with the MobileNetv2 idea, the model parameters are optimized to obtain a trapezoidal-MobileDarknet19 feature extraction network, which both reduces the computational load of the network and improves its efficiency. At the same time, it strengthens the transmission of target feature information and improves the learning ability of the model. In the scale detection stage, a five-layer DenseNet network is introduced to enhance the network's ability to detect small targets. The experimental results show that the proposed YOLOv4-ST algorithm outperforms the original YOLOv4 algorithm, improving mAP by 5.5%. Compared with other mainstream algorithms, it is more practical for the task of detecting student classroom behavior.

Abstract:In shooting range tests, the test site often involves a changeable natural environment that includes dust, strong light, occlusion and so on. A single-target tracking algorithm associated with dynamic features is proposed to track fast-moving targets in such conditions. Firstly, a gated recurrent unit is used to extract the time-series dynamic characteristics of the target to be tracked, so as to obtain a set of candidate target frames. Then, a convolutional network is adopted to extract the deep convolutional features of the candidate target frames and determine the target position, while separating out the background convolutional features. During tracking, the separated background convolutional feature map is used to update the network parameters, enhancing the robustness and adaptability of the network. Experimental results show that the proposed algorithm can adaptively track moving targets in the shooting range image acquisition system and maintains excellent robustness and adaptability in complex environments.

Abstract:Entity alignment is an important technique for fusing knowledge bases from different sources. It is widely used in the fields of knowledge graphs and knowledge completion. Existing entity alignment models based on graph attention mostly use a static graph attention network and ignore the semantic information in entity attributes, which leads to limited attention, difficulty in fitting and insufficient expressive ability. To solve these problems, this paper studies an entity alignment method based on structure modeling with dynamic graph attention. Firstly, the single-hop node representation of the target entity is modeled by GCN. Secondly, the multi-hop node attention coefficients are obtained and the entity is modeled using the dynamic graph attention network, and the single-hop and multi-hop node information output by the GCN and the dynamic graph attention layer is then aggregated by a layer-wise gating network. Finally, the entity attribute semantics extracted by a natural language model pre-trained on external knowledge are embedded and concatenated to calculate similarity. The method achieves improvements on the three cross-language datasets of DBP15K, which demonstrates the effectiveness of applying a dynamic graph attention network and integrating entity attribute semantics to improve entity representation.

Abstract:With the rapid development of various industrial fields, market demand for thin-gauge, high-strength strip products is increasing rapidly. The cross-section shape of hot rolled strip is the main evaluation index of hot rolled strip product quality. Based on data mining technology, the data in the mill database are analyzed and processed. The approach combines the deep belief network (DBN) and back propagation (BP) neural network algorithms to construct a prediction model of strip thickness distribution. The DBN-BP algorithm is composed of several restricted Boltzmann machines (RBM) stacked layer by layer; the weight matrix and bias of the network are obtained for the BP network by unsupervised layer-by-layer training, and the BP neural network then fine-tunes the whole network by error back propagation. This method overcomes the disadvantages of a BP network with randomly initialized weights, namely falling into local optima and long training time. With the DBN-BP method, the probability that the predicted midpoint thickness error falls within ±5.6 μm is 95%, whereas the prediction error of the BP algorithm is within ±11 μm. Analysis of the predicted cross-section shape of the strip shows that the DBN-BP deep learning method has advantages over the BP algorithm in predicting the edge thickness of the strip.

Abstract:Emotion is closely related to human behavior, family and society. Emotion reflects not only all kinds of human feelings, thoughts and behaviors, but also the psychological and physiological responses produced by various external stimuli. The correct identification of emotion is therefore very important in many fields. A change of emotion leads to a change in the electroencephalogram (EEG) signal; conversely, these changes reflect the change of emotional state. Based on the DEAP database, this paper extracts the time-domain and frequency-domain features of EEG signals and reduces the dimension of the features by principal component analysis (PCA). The weighted KNN algorithm is trained with 5-fold cross validation. The resulting recognition accuracy for excited, relaxed, depressed and angry emotions reaches 80%.
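
A weighted KNN classifier differs from plain KNN in that closer neighbors cast stronger votes. The sketch below assumes inverse-distance weighting, which is one common choice; the abstract does not say which weighting the authors use, and the two-dimensional feature vectors here are hypothetical stand-ins for PCA-reduced EEG features.

```python
import math
from collections import defaultdict

def weighted_knn_predict(train_x, train_y, query, k=3, eps=1e-9):
    """Classify `query` by an inverse-distance-weighted vote of its k
    nearest training points (Euclidean distance)."""
    neighbours = sorted(
        (math.dist(x, query), y) for x, y in zip(train_x, train_y)
    )[:k]
    votes = defaultdict(float)
    for distance, label in neighbours:
        # eps avoids division by zero when the query equals a training point.
        votes[label] += 1.0 / (distance + eps)
    return max(votes, key=votes.get)

# Hypothetical 2-D features after dimensionality reduction.
features = [(0.0, 0.0), (3.0, 0.0), (3.0, 1.0)]
labels = ["relaxed", "excited", "excited"]
print(weighted_knn_predict(features, labels, (0.1, 0.0), k=3))
```

With k=3 an unweighted majority vote would pick the label held by two of the three neighbors, while the weighted vote favors the single very close neighbor.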

Abstract:In information warfare, human-machine interaction is increasingly widely applied in command and control equipment. Voice recognition is used in place of the keyboard, knobs, keys and other input methods to change the parameters of the radio station, making operation more intelligent and convenient. The system processes the voice collected by the LD3320 speech recognition chip and sends the related protocols through the main control chip STM32F407 to control the radio station over the serial port. The experiment shows that the recognition rate of the system is about 95%, and that it can carry out radio frequency changes and switching operations.

Abstract:Partial discharge is the phenomenon of dielectric discharge caused by uneven distribution of the electric field under high electric field intensity. Partial discharge in equipment does great harm to the insulation layer, so rapid detection and identification of the discharge type is essential for normal industrial operation. For the problem of recognizing the partial discharge type of electrical equipment, and considering the timeliness and recognition accuracy required of the equipment monitoring system in diagnosis, this paper puts forward a partial discharge pattern recognition method based on edge computing. The method exploits the edge computing architecture of cloud training and edge inference: training and optimization of the complex recognition algorithm are deployed in the cloud, the computation-heavy recognition algorithm is offloaded to the edge layer, and the lightweight feature extraction is kept on the terminal device layer. The statistical characteristic parameters of partial discharge (PD) are extracted by constructing the PD phase distribution spectrum, and the generalized regression neural network model is optimized by the particle swarm optimization algorithm. Finally, the statistical characteristic parameters are used as the input of the neural network to identify the discharge types. The results show that the proposed pattern recognition method has high recognition accuracy and efficiency.

Abstract:Existing recommendation algorithms based on graph neural networks can use graph structure information to improve recommendation, but the graph structure mainly revolves around a single kind of interaction between users and items and ignores the multiple behaviors of users. For example, clicking, bookmarking, sharing and adding to the shopping cart all express different user semantics, and comment information may affect the user's next purchase intention for that type of item. To this end, a graph neural network recommendation algorithm based on user behavior and comment information is proposed. The algorithm learns the strength and semantics of user behavior through a graph convolutional network, then uses a comment text graph to learn representations of user and product preferences from the reviews, and finally combines the two to improve recommendation. The experimental results show that the algorithm improves the recommendation effect to a certain extent.

Abstract:With the large-scale development of electric vehicles, the load of charging stations has a certain impact on the power grid. To ensure that the power grid runs steadily, an electric vehicle charging load forecasting model based on the integration of eXtreme Gradient Boosting (XGBoost) and Light Gradient Boosting Machine (LightGBM) is proposed. The method uses the stacking strategy of ensemble learning. Firstly, the base load forecasting models are constructed with XGBoost and LightGBM respectively. Then the Ridge Regression (RR) algorithm is used to fuse the outputs of the base models, and the fused result is the load forecast. Comparative experiments against a variety of load forecasting models are carried out using order data from a charging station in Jiading District, Shanghai. The results show that the load forecasting model constructed by this method has higher forecasting accuracy than models based on a single algorithm, and has theoretical and practical value for the smooth operation of the power grid.
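
The fusion step of stacking can be sketched without the gradient boosting libraries themselves. In the sketch below, `pred_a` and `pred_b` are hypothetical stand-ins for held-out predictions from the two base models, and the meta-learner is a two-feature ridge regression without an intercept, solved in closed form. The regularization strength `lam` and the absence of an intercept are assumptions, not details from the abstract.

```python
def ridge_fuse_weights(pred_a, pred_b, target, lam=1e-3):
    """Solve (X^T X + lam * I) w = X^T y for the two fusion weights,
    where the columns of X are the two base models' predictions."""
    saa = sum(a * a for a in pred_a) + lam
    sbb = sum(b * b for b in pred_b) + lam
    sab = sum(a * b for a, b in zip(pred_a, pred_b))
    say = sum(a * y for a, y in zip(pred_a, target))
    sby = sum(b * y for b, y in zip(pred_b, target))
    det = saa * sbb - sab * sab  # strictly positive whenever lam > 0
    w_a = (say * sbb - sby * sab) / det
    w_b = (sby * saa - say * sab) / det
    return w_a, w_b

def fuse(pred_a, pred_b, weights):
    """Fused forecast: weighted sum of the two base forecasts."""
    w_a, w_b = weights
    return [w_a * a + w_b * b for a, b in zip(pred_a, pred_b)]

# Hypothetical held-out base predictions and observed loads (kW).
pred_a = [1.0, 2.0, 3.0, 4.0]
pred_b = [2.0, 1.0, 4.0, 3.0]
observed = [1.5, 1.5, 3.5, 3.5]
weights = ridge_fuse_weights(pred_a, pred_b, observed)
print(weights, fuse(pred_a, pred_b, weights))
```

In practice the meta-learner is usually fitted on out-of-fold predictions so that it does not simply reward whichever base model overfits the training data.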

Abstract:A review of polarimetric Synthetic Aperture Radar (SAR) image classification technology is provided. Firstly, the application value of the classification is discussed, and the technical problems to be solved are set out. Secondly, the general workflow of polarimetric SAR image classification algorithms is analysed. Thirdly, the domestic and foreign research status and main technical characteristics are compared, with emphasis on summarizing intelligent polarimetric SAR image classification algorithms based on deep learning theory. Finally, in light of the development trend of SAR remote sensing, directions for further study of intelligent polarimetric SAR image classification technology are pointed out.

Abstract:With the rapid development of the Internet of Vehicles, more and more vehicle applications are computation-intensive and delay-sensitive. Resource-constrained vehicles cannot provide the computation and storage resources these applications require. Edge computing (EC) is a promising solution that meets the demand for low latency by providing computation and storage resources to vehicles at the network edge. Offloading tasks to edge servers not only overcomes the limited capacity of vehicles, but also avoids the high latency caused by offloading tasks to the remote cloud. In this paper, an efficient task offloading algorithm based on deep reinforcement learning is proposed to minimize the average completion time of applications. Firstly, the multi-task offloading strategy problem is formalized as an optimization problem. Secondly, deep reinforcement learning is used to obtain an optimized offloading strategy with the lowest completion time. Finally, the experimental results show that the proposed algorithm performs better than the other baselines.

Abstract:Object detection methods have great value in video surveillance applications. At present, it is difficult to achieve real-time object detection in resource-constrained video surveillance systems. An object detection method based on an improved YOLOv3-tiny is proposed. Based on the YOLOv3-tiny architecture, the algorithm optimizes the backbone network by adding feature reuse, and a fully-connected attention mix module is proposed to enable the network to learn richer spatial information, making it more suitable for object detection under resource constraints. The experimental data show that, compared with YOLOv3-tiny, the algorithm reduces the model size by 39.2% and the number of parameters by 39.8%, and improves mAP by 2.7% on the VOC dataset, significantly reducing the model's resource usage while improving detection accuracy.

Abstract:When diagnosing liver echinococcosis, clinicians have to judge the classification of cystic liver echinococcosis from personal experience. An automatic detection and classification model for liver hydatid lesions based on an object detection algorithm was studied, in order to achieve automatic recognition and classification of ultrasound images of liver hydatid disease. The YOLOv5l model was used as the detection model for cystic liver echinococcosis, and the network was trained on a local liver echinococcosis ultrasound image dataset. The automatic detection and classification model, based on YOLOv5l and optimized with the stochastic gradient descent (SGD) algorithm, can effectively detect five types of lesions. The mean average precision (mAP) is 88.1%, and in testing the model reaches a speed of 40 f/s. The experimental results show that the model based on YOLOv5l and SGD can better identify the specific location of lesions and assist doctors in the diagnosis of liver echinococcosis.

Abstract:Vehicle attribute detection is a basic task that can be applied to many downstream traffic vision tasks. This paper presents an improved vehicle attribute detection algorithm based on YOLOv5. To address small target detection, a convolutional attention module is added so that the network pays more attention to small target objects. To address the small number of sample types in the dataset, the mosaic data augmentation method of YOLOv5 is improved. The self-gated activation function Swish is used to suppress noise, accelerate convergence and improve the robustness of the model. In addition, detailed vehicle attribute labels are produced on the basis of the public vehicle dataset VeRi-776 to construct a vehicle attribute dataset. The experimental results show that the average accuracy of the improved algorithm is 4.6% higher than that of the original YOLOv5; it can accurately detect the general attributes of vehicle images and can be used for downstream tasks.

Abstract:The rapid increase of the number of cancer patients has attracted worldwide attention. Researchers are very concerned about the assessment of the carcinogenicity of compounds, but this is extremely challenging. In this paper, 341 kinds of experimental data were obtained, and the spatial atom feature combined with the spatial graph convolutional network(SGCN) was used to establish a model that could predict the carcinogenicity of compounds. The results showed that when compared to other models, the classification model of the SGCN was more suited to predicting the carcinogenicity of compounds and had an overall classification accuracy of 96.9%, which showed that the SGCN model could accurately classify chemicals and had considerable potential in practical applications.

Abstract:The existing air quality prediction methods rarely consider seasonal factors, and the prediction effect is not good. Therefore, an air quality prediction method based on improved binary chaotic crow search algorithm(BCCSA) and deep long short term memory neural network(LSTM) is proposed. Firstly, the method of seasonal adjustment is proposed to preprocess the collected original air quality data in order to eliminate the influence of season on prediction. Then, an improved BCCSA is proposed to optimize the air quality data. Finally, the self-attention mechanism is added to the deep LSTM to predict the air quality data. The experimental results show that this method can effectively improve the prediction accuracy of air quality.

Abstract:A feature screening method based on the alpha wave and principal component analysis is proposed to solve the problem that weakly correlated features reduce classification accuracy in EEG motor imagery classification. Using a brain-computer interface system, the EEG signals corresponding to left and right motor imagery tasks are elicited by auditory stimulation and processed by wavelet packet decomposition; the α band signals of the EEG are then reconstructed so that the α waveforms and their statistical features can be extracted. PCA is combined with the SVM method to eliminate weakly correlated features and perform classification. On the selected data, the accuracy of signal classification is improved from 90.1% to 94.0%.

Abstract:With the growth of human-computer voice interaction scenarios in recent years, using microphone array speech enhancement to improve speech quality has become a research hotspot. Unlike ambient noise, in the multi-speaker separation scenario the interfering speaker's speech and the target speaker's speech are both speech signals and show similar time-frequency characteristics, which poses a greater challenge to traditional microphone array speech enhancement technology. For the multi-speaker separation scenario, the spatial response cost function of the microphone array is constructed and optimized with a deep learning network. The desired spatial transmission characteristics of the microphone array are designed through deep learning model training, so that the separation effect is improved by improving the beamforming performance. Simulation and experimental results show that this method effectively improves the performance of multi-speaker separation.

Abstract:With the growth of urban traffic congestion, how to recommend the fastest driving route to end users has become a research focus. The core problem of route recommendation is forecasting the traffic condition of a route section at the future time when the user will drive on it. The traffic condition is influenced by many factors, such as the road condition itself, the time of passage, weather conditions and the habits of the driver. Because traffic conditions change quickly and in complicated ways, they are difficult to predict accurately and directly. This paper proposes a more efficient traffic condition prediction model based on an improved M-order Markov chain. The model was tested with actual traffic data from Beijing and produced good results.
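
An M-order Markov chain predicts the next state from the last M observed states. The sketch below is the plain, unimproved version, since the abstract does not describe the authors' improvement; the state names and the sample sequence are hypothetical.

```python
from collections import Counter, defaultdict

def fit_markov(states, order):
    """Count how often each state follows each length-`order` history."""
    counts = defaultdict(Counter)
    for i in range(order, len(states)):
        history = tuple(states[i - order:i])
        counts[history][states[i]] += 1
    return counts

def predict_next(counts, recent, order):
    """Most frequent successor of the last `order` states, or None if
    that history never occurred in the training sequence."""
    history = tuple(recent[-order:])
    if history not in counts:
        return None
    return counts[history].most_common(1)[0][0]

# Hypothetical traffic states of one road section at fixed intervals.
observed = ["free", "free", "slow", "jam", "free", "free",
            "slow", "jam", "free", "free", "slow"]
model = fit_markov(observed, order=2)
print(predict_next(model, observed, order=2))
```

Raising the order lets the model capture longer patterns, at the cost of many more distinct histories and sparser counts for each one.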

Abstract:Wearing a mask can effectively prevent the spread of the virus. To reduce the large amount of human resources consumed by manual inspection of mask wearing, this paper proposes a deep learning method for mask wearing detection and tracking, which is divided into two modules: detection and tracking. Based on the YOLOv3 network, a spatial pyramid pooling structure is introduced into the detection module to achieve feature fusion at different scales, and the loss function is changed to the CIoU loss to reduce the regression error and improve detection accuracy, which provides good conditions for the subsequent tracking module. The tracking module adopts the multiple object tracking algorithm Deep SORT to track the detected objects in real time, which effectively avoids repeated detection and improves the tracking of occluded targets. The test results indicate that the detection speed of this method is 38 f/s and the average accuracy is 85.23%, which is 4% higher than the original YOLOv3 algorithm, so the method achieves real-time detection of mask wearing.
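
The CIoU loss named in this abstract adds a center-distance term and an aspect-ratio term to the usual IoU loss. The sketch below follows the standard published definition for a single pair of axis-aligned boxes; the `(x1, y1, x2, y2)` box format and the sample boxes are assumptions for illustration.

```python
import math

def ciou_loss(box_a, box_b):
    """CIoU loss = 1 - IoU + rho^2 / c^2 + alpha * v for two boxes given
    as (x1, y1, x2, y2) with x2 > x1 and y2 > y1."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union
    # rho^2: squared distance between the two box centers.
    rho2 = (((ax1 + ax2) - (bx1 + bx2)) / 2) ** 2 \
         + (((ay1 + ay2) - (by1 + by2)) / 2) ** 2
    # c^2: squared diagonal of the smallest box enclosing both boxes.
    c2 = (max(ax2, bx2) - min(ax1, bx1)) ** 2 \
       + (max(ay2, by2) - min(ay1, by1)) ** 2
    # v measures aspect-ratio mismatch; alpha is its trade-off weight.
    v = (4 / math.pi ** 2) * (
        math.atan((bx2 - bx1) / (by2 - by1))
        - math.atan((ax2 - ax1) / (ay2 - ay1))
    ) ** 2
    alpha = v / (1 - iou + v) if v > 0 else 0.0
    return 1 - iou + rho2 / c2 + alpha * v

print(ciou_loss((0, 0, 2, 2), (1, 1, 3, 3)))
```

Unlike the plain IoU loss, this loss still changes with distance when the boxes do not overlap at all, because the center-distance term remains non-zero.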

Abstract:Given the high sensitivity, low cost and easy portability of underwater robots for ocean detection, this paper studies the key binocular vision detection technology applied to ranging of underwater target objects. Zhang Zhengyou's calibration algorithm is adopted to obtain the intrinsic and extrinsic parameter matrices of the underwater binocular camera using a 9×9 checkerboard, and the SGBM stereo matching algorithm is adopted to enhance the image contrast and weaken the influence of image color spots. The approach ensures the robustness of the algorithm and improves the matching search speed; it transforms the disparity map into a depth map through matrix operations and maps it into a visual point cloud to construct the three-dimensional information of the target object.
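
The conversion from disparity to depth can be sketched for a rectified stereo pair with the pinhole relation Z = f·B/d, followed by back-projection of a pixel into a 3-D point. The focal length, baseline and principal point below are hypothetical calibration values, and the sketch ignores underwater refraction, which a real underwater system would need to account for.

```python
def disparity_to_depth(disparity_px, focal_px, baseline_m):
    """Depth in metres from disparity in pixels: Z = f * B / d.
    Returns None for non-positive disparity (no valid match)."""
    if disparity_px <= 0:
        return None
    return focal_px * baseline_m / disparity_px

def pixel_to_point(u, v, depth_m, focal_px, cx, cy):
    """Back-project pixel (u, v) at a known depth into camera coordinates,
    assuming square pixels and principal point (cx, cy)."""
    x = (u - cx) * depth_m / focal_px
    y = (v - cy) * depth_m / focal_px
    return (x, y, depth_m)

# Hypothetical calibration: f = 800 px, baseline = 0.1 m, centre (320, 240).
depth = disparity_to_depth(40.0, focal_px=800.0, baseline_m=0.1)
print(depth, pixel_to_point(400, 240, depth, 800.0, 320.0, 240.0))
```

Applying the second function to every pixel with a valid disparity yields the point cloud that the abstract refers to.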

Abstract:Traditional collaborative filtering recommendation algorithms generally suffer from low recommendation efficiency and recommendation quality that leaves room for improvement. To address these problems, a collaborative filtering recommendation algorithm that integrates mixed clustering with user interests and preferences is proposed, and verification shows that the recommendation quality is significantly improved. Firstly, a multiple mixed clustering model of Canopy + Bi-Kmeans is constructed from the personal information of users. The proposed mixed clustering model is used to divide all users into multiple clusters, and the interest preferences of each user are fused into the generated clusters to form a new similarity calculation model. Secondly, a weight classification method based on the TF-IDF algorithm is used to calculate the weight of users on labels, and an exponential decay function incorporating a time coefficient is used to capture the change of users' interest preferences over time. Finally, weighted fusion is used to combine user preferences with the mixed clustering model to match more similar neighbor users, calculate item scores and make recommendations. The experimental results show that the proposed method improves recommendation quality and reliability.
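
The time-decay idea can be sketched as follows: each interaction with a label contributes a weight that shrinks exponentially with its age, so recent interests dominate. The decay form exp(-rate × days), the rate value and the sample events are assumptions; the abstract does not give the authors' exact function or how it is combined with TF-IDF.

```python
import math

def time_decay(days_since, decay_rate=0.05):
    """Exponential decay weight in (0, 1]; newer interactions weigh more."""
    return math.exp(-decay_rate * days_since)

def label_preferences(events, decay_rate=0.05):
    """Normalised preference per label from (label, days_since) events.
    Assumes `events` is non-empty."""
    scores = {}
    for label, days in events:
        scores[label] = scores.get(label, 0.0) + time_decay(days, decay_rate)
    total = sum(scores.values())
    return {label: score / total for label, score in scores.items()}

# Hypothetical interaction history of one user.
history = [("sci-fi", 1), ("sci-fi", 2), ("drama", 200)]
print(label_preferences(history))
```

A larger `decay_rate` makes the profile react faster to changes in interest but also makes it noisier for users with few recent interactions.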

Abstract:As ships become more intelligent, shipboard teleconferencing systems are of great significance for improving emergency handling capacity and promoting the construction of shipboard integrated networks. The microphone array is an important voice front end that ensures voice quality and supports the multi-mode interaction of the teleconferencing system. However, the small size of ship cabins forces the adoption of small-size arrays, and the strong reverberation and cabin noise in such spaces seriously degrade the performance of traditional microphone array algorithms. For direction of arrival (DOA) estimation with small-size arrays in the complex environment of a ship cabin, a lightweight Mask-DOA estimation neural network model is proposed in this paper. In this method, a Mask algorithm is introduced into the DOA estimation neural network to reduce noise and reverberation interference, and the enhanced GCC-PHAT is then extracted as the network feature, so as to achieve high-precision DOA estimation on the small-size microphone array. Simulation and experimental results show that the proposed Mask-DOA model is more robust and generalizes better in the complex environment of a ship cabin.
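For readers unfamiliar with the GCC-PHAT feature named above, the following is a minimal sketch of plain GCC-PHAT delay estimation between two microphone signals. It is not the mask-enhanced variant of the paper; it assumes equal-length frames and circular correlation, uses a naive DFT so that it needs only the standard library, and all names are illustrative.

```python
import cmath


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (inverse includes the 1/n factor)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        out.append(s / n if inverse else s)
    return out


def gcc_phat_delay(x1, x2, eps=1e-12):
    """Estimate how many samples x2 lags x1 (negative if x2 leads)."""
    X1, X2 = dft(x1), dft(x2)
    cross = [b * a.conjugate() for a, b in zip(X1, X2)]  # cross-power spectrum
    whitened = [c / (abs(c) + eps) for c in cross]       # PHAT: keep phase only
    corr = [v.real for v in dft(whitened, inverse=True)]
    n = len(corr)
    lag = max(range(n), key=lambda i: corr[i])
    return lag if lag <= n // 2 else lag - n             # map to a signed lag


# Toy example: x2 is x1 delayed by 3 samples (circular shift).
x1 = [0.0, 1.0, -0.5, 0.3] + [0.0] * 12
x2 = x1[-3:] + x1[:-3]
delay = gcc_phat_delay(x1, x2)
```

Given the microphone spacing and the sampling rate, such a delay can be converted to an arrival angle; the phase-only weighting is what makes the correlation peak sharp under reverberation.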

Abstract:Traffic signs have been studied extensively in the context of safe driving and automatic driving of vehicles. Because of the wide variety of traffic signs and the influence of various factors, the classification and detection of traffic signs remains a challenging problem. To this end, a traffic sign classification and detection method that combines tags with real road scenes is proposed. The method is divided into a data generation part and a target detection part. Experimental results show that training data generated with this method can effectively train deep convolutional neural networks to classify and detect traffic signs in real scenes, and that the optimized detection model is smaller and faster than the model mentioned in the article.

Abstract:As one of the autopilot schemes for automatic guided vehicles (AGVs), electromagnetic guidance is widely used in industry, logistics, and other fields. Traditional electromagnetic guidance schemes place high requirements on mechanical structure and are easily limited by the small preview range of the sensors, which makes them difficult to apply to small AGVs. To remedy the limited detection ability caused by limited preview, a fully connected neural network model is designed and trained to detect both the vehicle's posture and rear road information. Both simulation and actual tests show that the presented scheme greatly improves the control performance of an electromagnetic guidance system with small size and limited-preview sensors. Throughout the process, the vehicle runs rapidly and steadily.

Abstract:At present, using a gated recurrent unit (GRU) neural network to predict traffic suffers from problems such as lag and low prediction accuracy. This paper proposes an improved GRU model for traffic prediction. Firstly, based on the GRU neural network, a network model integrating a Bi-GRU neural network and an artificial neural network is proposed, which accepts multi-dimensional input vectors such as traffic features, time features, and event features. At the same time, to improve accuracy in certain time periods, the training samples are classified by date type, and a separate network model is generated for each type of date. This greatly improves prediction accuracy and reduces prediction lag. Finally, to improve the prediction accuracy of peak traffic, sample propensity balancing and a user-defined loss function are applied, and the experimental results show that the intended goal is achieved.

Abstract:The Internet has an important impact on people's life and work. However, a large number of harmful gambling websites are hidden in cyberspace; they can easily cause losses and trouble for netizens and can even disturb social order. Therefore, it is of great significance to study efficient methods for recognizing such websites. In this paper, a deep residual neural network is used to solve the problem of gambling web page recognition, and the algorithm GamblingRec is designed based on the principle of the deep residual network. The results show that the accuracy of GamblingRec reaches 95.16% and the positive sample recall rate is 93.21%, which indicates that the method based on the deep residual neural network is applicable to gambling web page recognition and achieves high recognition performance.

Abstract:In order to address the insufficient human-computer interaction ability of ball training partner robots, a new human-computer interaction mode based on Raspberry Pi and the YOLOv5 algorithm is proposed, which enables the robot to perform six different actions: forward, backward, left, right, throwing the ball, and kicking the ball. After calibrating and training on data sets collected in three different environments (indoor, outdoor sunny day, and outdoor cloudy day), the recognition accuracy of the six poses on the test set is 96.33% indoors, 95% outdoors on a sunny day, and 94.3% outdoors on a cloudy day. Compared with other algorithms based on feature matching and small target detection using gestures, the robot has higher detection speed and accuracy, which makes it more intelligent.

Abstract:Aiming at the efficient classification and handling of domestic waste, this article designs a photoelectric smart car system with the edge embedded AI device Jetson Nano as the controller. The system uses YOLOv5 as the target detection algorithm and PyTorch 1.8.1 as the deep learning framework. The smart car starts from a designated location, searches for garbage in the designated area with its own photoelectric sensor, identifies and classifies the garbage, and uses a six-axis robotic arm to sort it and send it to the designated stacking place. Training was performed for 300 iterations on the 5,048 collected pictures covering 5 types of garbage. The experimental test results show that the average accuracy reaches 91.8%, the accuracy rate reaches 94.5%, and the recall rate reaches 89.03%.

Abstract:In recent years, recognizing and analyzing people's facial expressions through artificial intelligence has become a research hotspot. Artificial intelligence can quickly analyze people's facial emotions, and further research builds on this basis. In deep learning, traditional convolutional neural networks cannot extract facial expression features sufficiently and have a large number of parameters, which leads to low classification accuracy. Therefore, a facial expression recognition algorithm based on the VGG16 neural network is proposed. In comparison experiments against InceptionV3, InceptionResNetV2, and ResNet50, the results show that the recognition accuracy of the VGG16 neural network on the FER2013PLUS test data set is 79%, which is higher than that of traditional convolutional neural networks.

Abstract:In automatic detection, road damage data sets suffer from hard-to-detect small-target damage and class imbalance, resulting in low accuracy and a high false detection rate. For this reason, based on the DSSD (deconvolutional single shot detector) network model, a road damage detection algorithm combining an attention mechanism and Focal loss is proposed. First, ResNet-101, which has higher recognition accuracy, is used as the base network of the DSSD model. Second, an attention mechanism is added to the ResNet-101 backbone; channel attention and spatial attention are combined to weight features along the channel dimension and focus on the spatial dimension, which improves the detection of small-target road damage. Finally, Focal loss is used to reduce the weight of easy samples and increase the weight of hard-to-classify samples, improving the overall detection performance. The method is verified on the data set provided by the Global Road Damage Detection Challenge. The experimental results show that the average accuracy of the model is 83.95%, which is more accurate than road damage detection methods based on the SSD and YOLO networks.
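To illustrate how Focal loss down-weights easy samples, here is a minimal sketch of the binary form FL(p_t) = -α_t (1 - p_t)^γ log(p_t). The defaults γ = 2 and α = 0.25 are commonly used values assumed for this example; the abstract does not state which settings the paper uses.

```python
import math


def focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-12):
    """Binary focal loss for predicted probability p of class 1 and label y in {0, 1}."""
    p_t = p if y == 1 else 1.0 - p             # probability assigned to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    p_t = min(max(p_t, eps), 1.0)               # guard against log(0)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy = focal_loss(0.95, 1)  # confident, correct prediction: heavily down-weighted
hard = focal_loss(0.05, 1)  # confident, wrong prediction: keeps most of its loss
```

With γ = 0 the modulating factor disappears and the expression reduces to α-weighted cross-entropy, which is a convenient sanity check.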

Abstract:The analysis of public policy is vitally significant in administrative studies. With the wide application of deep learning and knowledge graphs, any effort to improve technical methods in this area is likely to be of great benefit. Applying natural language understanding and related technologies to industrial policy research is therefore helpful and will be critical to policy formulation and management. This article proposes a set of technical analysis methods for software industry policy, aiming to construct a relationship network among policies to assist government decision-making.

Abstract:Multi-source sensing data and its noise form a complex, nonlinearly separable space. Data fusion is an important method for eliminating redundant data safely, accurately, and efficiently in resource-constrained sensor networks. Because of the generalization ability of SVM and its convex optimization, this paper uses simulation experiments to examine the feasibility of transforming nonlinearly separable multi-source data sets into high-dimensional linearly separable spaces. The method based on width parameter range estimation can accurately determine the width parameter of the Gaussian kernel. For multi-class classification, the simulation experiments show that controlling the accumulation of errors ensures the classification more effectively.
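The role of the Gaussian kernel width parameter can be seen in a short sketch. It assumes the common form K(x, y) = exp(-||x - y||² / (2σ²)), with σ as the width parameter; the abstract does not give the paper's exact parameterization or its range estimation procedure, so the values below are illustrative only.

```python
import math


def gaussian_kernel(x, y, sigma):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


# The same pair of points looks dissimilar under a narrow kernel
# and similar under a wide one.
narrow = gaussian_kernel([0.0, 0.0], [1.0, 1.0], sigma=0.2)
wide = gaussian_kernel([0.0, 0.0], [1.0, 1.0], sigma=5.0)
```

A width that is too small makes every sample nearly orthogonal to every other in feature space, while one that is too large makes them nearly identical, which is why estimating a suitable range for the width matters.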

Abstract:The planning of infrastructure AIOps scenarios for the China Mobile private cloud is described, and two typical scenarios, "Intelligent Index Anomaly Detection" and "Intelligent Alarm Traceability", are studied. The algorithms and business processes of the two scenarios are introduced respectively. The method for evaluating the effect of the two scenarios is discussed, and deployment in actual production verifies the implementation effect.

Abstract:Object detection algorithms based on deep learning are difficult to deploy on low computing power platforms such as mobile devices due to their complexity and computational demands. In order to reduce the scale of the model, this paper proposes a lightweight object detection algorithm. Based on top-down feature fusion, the algorithm builds a feature pyramid network with an added attention mechanism to achieve more fine-grained feature expression. The proposed model takes a 320×320 image as input, requires only 0.72 B FLOPs, and achieves 74.2% mAP on the VOC dataset, an accuracy similar to that of traditional one-stage object detection algorithms. Experimental data show that the algorithm significantly reduces the computational complexity of the model while maintaining accuracy, and is more suitable for object detection on low computing power platforms.

Abstract:Considering that the current power industry still lacks effective domain word discovery methods, this paper takes the power industry science and technology project text as the original corpus, combines the statistical features based on the mutual information, left entropy as well as right entropy with the features of traditional language word-formation rules, and proposes the new concept of power text word formation rate. The proposed method firstly uses the word formation rate to get the initial candidate word set by unsupervised filtering, and then performs the text slicing algorithm and common word filtering operation on the candidate word set, and finally performs the word embedding and spectral clustering algorithms to get the final power text-domain words. Experimental results show that the method proposed in this paper is accurate and effective, and provides a new method for power text domain word discovery.
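The statistical features named above (mutual information, left entropy, right entropy) can be sketched as follows. This is a generic illustration, not the paper's word formation rate: the pointwise form of mutual information and all function names are assumptions, and the counts in the comments are hypothetical.

```python
import math
from collections import Counter


def branch_entropy(neighbors):
    """Entropy of the tokens seen on one side (left or right) of a candidate word."""
    counts = Counter(neighbors)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def pointwise_mutual_information(count_xy, count_x, count_y, total):
    """log( p(xy) / (p(x) p(y)) ): how strongly two parts of a candidate stick together."""
    return math.log((count_xy / total) / ((count_x / total) * (count_y / total)))
```

A candidate that always has the same neighbor on one side has entropy 0 there, which suggests it is only a fragment of a longer word; a true domain word tends to combine high internal mutual information with high left and right entropy.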

Abstract:Focusing on the problem that the tracking accuracy of the Staple tracker is reduced by blurring from camera motion, an improved Staple tracker based on a background-weighted histogram is proposed. Firstly, because the traditional histogram ignores spatial information, position is added to the histogram. Furthermore, the color histogram of the background area is fully used to suppress the influence of background information on the histogram of the target area: a background-weighted histogram is introduced, completing the construction of the histogram classifier. Experiments on the OTB2015 benchmark compare the proposed tracker with 5 other state-of-the-art trackers. The results show that the proposed tracker improves distance accuracy and success rate by 3.7% and 2%, respectively.

Abstract:It is of great significance to monitor the aging state of the supporting capacitors in a power converter in real time and to find and replace defective capacitors promptly. In this paper, data sets are built from the relevant voltage and current data, the network model parameters are determined, and the model is trained. A neural network model based on CNN-LSTM is thus obtained. The accuracy of the model is verified with data sets collected under different working conditions. The results show that the model can reliably predict the capacitance value.

Abstract:A method and system for unlocking electronic door locks by voice command is proposed to meet people's demand for a better unlocking mode. The design idea is to use the uniqueness of the mobile phone number to identify the user and to use voice recognition technology so that different voice instructions open different door locks. The system consists of an electronic door lock, a mobile phone, and an Internet server. The method and system are designed and explained in detail, providing a basis for product design work such as code writing and circuit design. An electronic door lock based on this method and system offers a wide application range, convenient unlocking, better security, and a higher cost-performance ratio, which will have an important impact on the development of the electronic and smart door lock industry.

Abstract:This paper forecasts the financial risk of enterprises based on the financial index data of A-share enterprises on the main board of the Shanghai Stock Exchange. The samples include 1227 normal listed enterprises and 42 enterprises that have received financial warnings, so the data are seriously unbalanced. The problem of classifier failure on unbalanced samples is solved to some extent by resampling. Integrated machine learning based on Bagging is used to improve and optimize the prediction model. The highest probability of correctly selecting enterprises with financial warnings is 92.86%. On this basis, the overall accuracy on the sample improves by 5.4% after model integration. The integrated model improves the financial early warning ability for listed enterprises, which can provide a reference for the normal operation of enterprises and the safe investment of investors.
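One way to picture the resampling and Bagging steps is the sketch below. It assumes random oversampling of the minority class and a simple majority vote; the abstract does not say which resampling technique or base classifiers the paper uses, so both choices and all names are illustrative.

```python
import random


def oversample_minority(samples, labels, seed=0):
    """Randomly duplicate minority-class samples until all classes are equally large."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    target = max(len(items) for items in by_class.values())
    out_samples, out_labels = [], []
    for label, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for sample in items + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


def bagging_vote(predictions):
    """Majority vote over the base classifiers' predictions for one sample."""
    return max(set(predictions), key=predictions.count)


# Class sizes from the abstract: 1227 normal enterprises (0), 42 with warnings (1).
labels = [0] * 1227 + [1] * 42
balanced_samples, balanced_labels = oversample_minority(list(range(1269)), labels)
```

In a full Bagging setup each base classifier would be trained on its own bootstrap sample of the balanced data, and `bagging_vote` would combine their outputs.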

Abstract:Manual reading of images involves a heavy workload, yields poor reading quality, and is prone to missed inspections and wrong judgments. In this paper, the Faster R-CNN target detection model is therefore applied to the detection of hepatic echinococcosis in CT images, and the model is improved in two ways. First, given the low image resolution and varying lesion sizes, the deeper residual network ResNet101 replaces the original VGG16 to extract richer image features. Second, using the lesion coordinates obtained by the object detection model, the LGDF model is introduced to further segment the lesion and help doctors diagnose the disease more efficiently. The experimental results show that the object detection model based on the ResNet101 feature extraction network effectively extracts target features, with detection accuracy 2.1% higher than that of the original detection model. In addition, with the lesion coordinates introduced into the LGDF model, segmentation of hepatic hydatid lesions improves over the original LGDF model: the Dice coefficient increases by 5%, and the improvement is most pronounced for CT images of multi-cystic liver hydatidosis.
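The Dice coefficient used to score the segmentation is defined as 2|A ∩ B| / (|A| + |B|) for a predicted mask A and a reference mask B. A minimal sketch on binary masks follows; the function name and the toy masks are illustrative.

```python
def dice_coefficient(pred_mask, true_mask):
    """Dice overlap between two binary masks given as equal-shape 2-D lists of 0/1."""
    pred = [v for row in pred_mask for v in row]
    true = [v for row in true_mask for v in row]
    intersection = sum(1 for p, t in zip(pred, true) if p and t)
    size_sum = sum(1 for p in pred if p) + sum(1 for t in true if t)
    # Two empty masks agree perfectly by convention.
    return 1.0 if size_sum == 0 else 2.0 * intersection / size_sum


# Toy 2x2 masks: the prediction covers two pixels, the reference covers one of them.
score = dice_coefficient([[1, 1], [0, 0]], [[1, 0], [0, 0]])
```

The score ranges from 0 (no overlap) to 1 (identical masks), so the toy example gives 2·1 / (2 + 1) = 2/3.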

Abstract:Feature engineering can be automated to process and generate highly discriminative features without human operation. Feature engineering is an unavoidable and crucial part of machine learning. This article proposes a method based on reinforcement learning (RL) that treats feature engineering as a Markov decision process (MDP), and proposes an approximate method based on the upper confidence bounds applied to trees (UCT) algorithm to solve the feature engineering problem for binary numerical data and automatically obtain the best transformation strategy. The effectiveness of the proposed method is verified on five public data sets, where the FScore improves by an average of 9.032%. The method is also compared with other papers that use finite element transformation for feature engineering. The results indicate that the method obtains highly discriminative features, improves the learning ability of the model, and achieves higher accuracy.
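The selection rule at the heart of UCT can be sketched with the UCB1 score, mean reward plus c·sqrt(ln N / n), where N is the parent's visit count and n the child's. The transformation names and statistics below are hypothetical, and the exploration constant sqrt(2) is a common default rather than the paper's setting.

```python
import math


def ucb1_score(total_reward, visits, parent_visits, c=math.sqrt(2)):
    """UCB1: exploit the average reward, explore rarely visited children."""
    if visits == 0:
        return math.inf  # always try an unvisited child first
    return total_reward / visits + c * math.sqrt(math.log(parent_visits) / visits)


def select_transformation(stats, parent_visits):
    """stats maps a transformation name to (total_reward, visits)."""
    return max(stats, key=lambda name: ucb1_score(*stats[name], parent_visits))


# Hypothetical statistics for three candidate feature transformations.
stats = {"log": (4.0, 5), "square": (1.0, 5), "sqrt": (0.0, 0)}
choice = select_transformation(stats, parent_visits=10)
```

Here the unvisited transformation is chosen first; once every child has been tried, the rule trades off average reward against how seldom a child has been explored.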

Abstract:With the development of information technology, face recognition is increasingly used in payment, work, and security systems. In edge computing systems, a smaller neural network is usually chosen for face recognition for the sake of speed, which may result in a lower recognition rate. Moreover, in practical applications, most methods recognize faces well in high-quality images, but the recognition rate is low for faces that are strongly affected by lighting or show large changes in expression and posture. Therefore, this paper chooses the lightweight SqueezeNet network, which has a small number of layers and is well suited to edge computing systems. The images are first preprocessed, and then the loss function of the SqueezeNet network and the residual learning method of the ResNet network are improved. Finally, tests on the LFW and IJB-A data sets show that the method in this paper significantly improves the recognition rate.

Abstract:While communication technology brings convenience to people, telecom fraud is also increasing sharply. Traditional detection methods are mainly based on data mining and statistical learning of historical data. However, because fraudulent behavior closely resembles normal business, traditional statistical methods struggle to screen it out. This paper proposes to transform user communication relationships into a set of topological features and establish a directed communication social graph, where vertices with statistical characteristics represent users and edges with relational characteristics represent activities between them. On the basis of this graph, the latent characteristics of the communication social network are learned through a graph neural network, and the information of multiple nodes is aggregated through a pooling readout mechanism in order to identify telecom fraud users. Validation on real communication history data shows the effectiveness of this method.

Abstract:As edge devices develop rapidly and applications of deep learning multiply, the trend of combining the two is becoming more and more obvious. The potential of AI applications on low-power edge devices has not yet been fully developed: a large number of devices hold a great deal of untapped computing power, and releasing it would bring clear social and economic benefits. Therefore, taking face detection, a common object detection task, as an example, the MTCNN face detection algorithm is improved and ported to a low-power embedded platform with extremely limited resources. Under certain environmental conditions, faces are successfully detected and the candidate face bounding box is drawn; combined with a servo, the system also provides face tracking.

Abstract:With the rapid development of computer communication network technology, network attacks and sabotage emerge endlessly and in various forms. Situation awareness systems provide a comprehensive guarantee for network security. Improving the stability, accuracy, and speed of situation assessment and situation prediction modeling is an important direction of situation awareness research. As a deep learning algorithm, the deep belief network brings a new direction to the accuracy and theoretical grounding of network security situation assessment and prediction. In the deep belief network algorithm, the restricted Boltzmann machine serves as the basic network, and layer-by-layer pre-training and fine-tuning are the core parts of the network. A generalized index system for network security situation assessment is constructed, and a data-driven model for situation assessment and situation prediction of computer communication network security is established. Simulation experiments on the intrusion detection data set CIC-IDS2017 verify the accuracy and effectiveness of the model.

Abstract:Public opinion information on Weibo is large in volume and changes irregularly and randomly. In view of this, this paper proposes a Weibo sentiment analysis method based on the TFIDF-NB (Term Frequency-Inverse Document Frequency-Naive Bayes) algorithm. A Weibo comment crawler built on the Scrapy framework crawls several Weibo comments on a hot event and stores them in a database. Text segmentation and LDA (Latent Dirichlet Allocation) topic clustering are then performed, and finally the TFIDF-NB algorithm is used for sentiment classification. Experimental results show that the accuracy of the algorithm is higher than that of the standard linear Support Vector Machine algorithm and the K-Nearest Neighbor algorithm, that it also exceeds the K-Nearest Neighbor algorithm in accuracy and recall, and that it is more effective for sentiment classification.

Abstract:Tor is an anonymous Internet communication system based on the onion routing network protocol. Network traffic generated by normal applications becomes hard to trace when it is delivered through the Tor system. However, an increasing number of cyber criminals use Tor to remain anonymous while committing crimes or making illegal transactions. As a countermeasure, this paper presents a method that identifies Tor traffic and thereby recognizes the related Tor hosts. The method proposes several groups of features extracted from network traffic and uses machine learning algorithms to evaluate feature effectiveness. Experiments on a real-world dataset demonstrate that the proposed method can distinguish Tor flows from normal traffic and can also recognize the kind of activity generated in Tor by different normal applications.

Abstract:Convolutional neural networks play an important role in various fields, especially computer vision, but their application on mobile devices is limited by the excessive number of parameters and amount of computation. To address this, a new convolution algorithm, Group-Shard-Dense-Channle-Wise, is proposed by combining the ideas of grouped convolution, parameter sharing, and dense connection. Based on the PeleeNet network structure, an efficient lightweight convolutional neural network, GSDCPeleeNet, is obtained by improving the network with this convolution algorithm. Compared with other convolutional neural networks, this network shows almost no loss of recognition accuracy, or even higher recognition accuracy, with fewer parameters. In this network, the stride s along the channel direction of the convolution kernel in the 1×1 convolutional layer is chosen as a hyperparameter. Even when the number of network parameters is smaller, better image classification can be achieved by selecting and adjusting this hyperparameter appropriately.

Abstract:Based on the idea of software and hardware co-design, this article uses HLS tools to design and implement a convolutional neural network accelerator on the PYNQ-Z2 platform, and applies a matrix cutting optimization method to the convolution operations to balance resource consumption and computing resources, so that the performance of the accelerator is optimized. The MNIST data set is used to test the performance of the accelerator IP core. The experimental results show that, compared with the ARM platform, the accelerator achieves a speedup of 5.785 for a single-image test and 9.72 for a 1000-image test. These results suggest that the speedup grows as the number of test images increases.

Abstract:In order to extract user requirements effectively in the field of intelligent car styling design, a method for extracting car styling requirements based on multi-feature TFIDF (term frequency-inverse document frequency) text analysis is proposed. Firstly, a large number of unregistered professional terms are obtained through mutual information and boundary degrees of freedom to optimize the vocabulary produced by simple word segmentation. Next, to overcome the limitations of the classic TFIDF algorithm, vocabulary and emotional feature factors are introduced to obtain a candidate set of user demand features. Finally, effective user needs are selected according to a threshold. The experimental results show that the multi-feature TFIDF text analysis algorithm has certain advantages in feature extraction and can effectively extract user needs for car styling from text.

Abstract:Swarm intelligence has significant advantages in solving nondeterministic polynomial (NP) problems and problems with very large search spaces. In this paper, pigeon-inspired optimization (PIO) is applied to feature selection for intrusion detection systems. Sigmoid-based PIO (SPIO) and cosine-based PIO (CPIO) algorithms are proposed to select features from the intrusion detection data set KDDCUP99, and experiments with machine learning methods are conducted to build the model and evaluate the results.

Abstract:Defect detection is of great significance for the protection and repair of ancient buildings. Traditional floor tile defect detection relies on visual inspection, which is time-consuming and limited by human factors. Building on the promising application prospects of deep learning, this paper builds a data set of floor tile defects in the Forbidden City and proposes an improved Faster R-CNN. Firstly, deformable convolution is constructed, and the defect features in the floor tiles are learned and extracted through the network. Then the feature map is fed into a region proposal network to generate candidate region boxes, and the generated feature map and candidate region boxes are pooled. Finally, the defect detection results are output. On the image data set of floor tiles of the Forbidden City, the mean accuracy of the improved model reaches 92.49%, which is 2.99% higher than that of the Faster R-CNN model, making it more suitable for floor tile defect detection.

Abstract:This paper analyzes the shortcomings of current reversing image systems in actual use, and describes the limitations of radar in radiation surface, angular blind areas, and object type recognition, as well as the high cost that makes millimeter wave radar difficult to popularize. It puts forward the design of an automobile reverse anti-collision system based on ADAS, analyzes in depth the principle of the system's reversing function, and introduces in detail the ADAS algorithm based on image analysis from the reversing camera, including its key technical points and an innovative application based on the fused features of the reversing scene. The algorithm's recognition is accurate and effective. To accommodate differences in the installation position and orientation of cameras on different vehicles, a flexible function for adjusting the early warning and alarm areas is designed. Combined with TTS voice reminders and graphic display, it reminds the driver to take appropriate action to ensure driving safety, which greatly improves the user experience and practical value of the reversing anti-collision system and provides strong protection for safe driving.

Abstract:In this paper, a micro-expression and vehicle status recognition method is proposed that uses blocked local binary pattern from three orthogonal planes (LBP-TOP) features with weighted sparse representation as the classifier. First, the effective block is selected from the blocked image. Then the features extracted by the LBP-TOP feature descriptor are used as a dictionary. Next, weighted sparse representation (WSRC) combined with the dual augmented Lagrangian multiplier (DALM) algorithm performs sparse representation classification. Finally, the images are divided into blocks of different sizes, the effective block is chosen from these blocks, and the features are merged as the input to the classifier. The experiments are carried out on the CASME II, SAMM, and vehicle databases using leave-one-subject-out cross validation (LOSOCV). When the micro-expressions are classified into five categories, the classification accuracy reaches 77.30% and 58.82%, respectively, and the experiment on the vehicle state detection database reaches an 84.60% detection rate. Experimental results show the effectiveness of the proposed algorithm.

Abstract:Image quality degradation caused by rain streaks seriously affects the application of image and computer vision algorithms, so image deraining is necessary. At present, mainstream deraining methods based on deep learning are effective only for rain streaks of a single size, and suffer from problems such as incomplete rain streak removal and blurred backgrounds. To address these difficulties, a single-image deraining algorithm based on a deep controlled dense connection network is presented. A multi-scale block is introduced to enhance the ability to extract rain streaks of different sizes, an attention mechanism module is added to focus on rainy areas, and a controlled dense connection block is introduced to fully represent rain streak characteristics. Experiments show that the proposed method outperforms some mainstream methods on both the synthetic dataset and the real dataset.

Abstract:Dermatosis is a common and frequently occurring disease, so skin detection technology has attracted more and more attention. The convolutional neural network is a common skin detection method, but its model structure loses a lot of information. CapsNet is a newer kind of neural network that followed the convolutional neural network. Its vectorized representation better expresses spatial relationships, with each capsule serving its own task independently. This paper analyzes the basic structure and main algorithm of CapsNet, improves the network model to avoid overfitting, uses the improved CapsNet to identify preprocessed skin images, and compares it with a traditional convolutional neural network model. Experimental results show that the improved CapsNet identifies pigmented skin diseases effectively, with accuracy about 8 to 10 percent higher than that of the traditional method.

Abstract:In the field of speech enhancement, deep neural network can improve the enhancement ability of the model by training and modeling a large number of data with different noises in the supervised learning way. However, the acquisition cost of different types of noise is large and the noise types are difficult to be comprehensive, which affects the generalization ability of the model. Aiming at this problem, this paper proposes a noise data augmentation method based on generative adversarial network(GAN), which learns from the real noise data and synthesizes virtual noises according to the data features, so as to expand the number and type of the noise data in the training set. Experimental results show that the method of noise synthesis adopted in this article can effectively expand the source of noise in the training set, enhance the generalization ability of the model, and effectively improve the signal-to-noise ratio and intelligibility of speech signal after denoising.

Abstract:With the development of technology, the means of stealing electricity have become more specialized and diversified, while traditional anti-theft technology lacks real-time capability and feasibility. This paper studies intelligent diagnosis and feature extraction methods for electric energy meters during online operation, analyzes abnormal electricity consumption data, uses machine learning to determine feature-based abnormality thresholds, and uses association-rule data mining to fuse the independent detection results, thereby mining power theft data. Finally, the paper verifies the accuracy of the established model and derives a screening method for abnormal power consumption cases.

Abstract:In digitalized mines, pedestrian detection system is able to greatly reduce accident casualties, which is an essential strategy for guaranteeing workers′ well-being. In order to establish mine pedestrian detection system with high performance, a mine pedestrian detection based on side-window filter and dilated convolution is proposed. Specifically, in terms of mines environment with complicated and hostile conditions, side-window filter is adopted to suppress disturbing signals in surveillance pictures, improving image quality. In addition, considering the multi-scale characteristic of pedestrian objects, dilated convolution is introduced into model to increase receptive field of features, thus enhancing detection performance. A number of comparison experiments are conducted to illustrate the effectiveness of side-window filter and dilated convolution, and the model achieves excellent performance of 94.3 mAP and 99.1% of detection accuracy on the mine dataset.
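
The effect of dilation on the receptive field can be illustrated with a minimal one-dimensional sketch. It is not the detection model from the abstract; the function names `effective_kernel_size` and `dilated_conv1d` are hypothetical and chosen only for this illustration. A kernel of size k with dilation d spans k + (k - 1)(d - 1) input samples, so the receptive field grows without adding weights.

```python
def effective_kernel_size(kernel_size, dilation):
    """Number of input samples spanned by one dilated kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, kernel, dilation=1):
    """Valid (no padding), stride-1 dilated cross-correlation in 1-D."""
    span = effective_kernel_size(len(kernel), dilation)
    output = []
    for start in range(len(signal) - span + 1):
        total = 0.0
        for k, weight in enumerate(kernel):
            # Kernel taps are spaced `dilation` samples apart.
            total += weight * signal[start + k * dilation]
        output.append(total)
    return output


if __name__ == "__main__":
    print(effective_kernel_size(3, 1), effective_kernel_size(3, 2))
    print(dilated_conv1d([1, 2, 3, 4, 5], [1, 0, -1], dilation=2))
```

With the same three weights, a dilation of 2 covers five input samples instead of three, which is the property the abstract relies on for multi-scale pedestrians.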

Abstract:Traditional convolutional neural network quantization algorithms widely use symmetric uniform quantization to quantize model weights without taking into account the correlation between the quantization of adjacent weights; that is, the quantization noise generated by quantizing one weight can be compensated by adjusting the quantization direction of the following weights. To address this problem, a ternary convolutional neural network quantization algorithm based on the idea of weight interaction is proposed, which achieves a model compression ratio of 16 times. On the ImageNet dataset, the prediction accuracy of the ternarized AlexNet and ResNet-18 networks decreases by less than 3%. The method achieves a high model compression ratio with higher accuracy, and can be used to port convolutional neural networks to mobile platforms with limited computing resources.
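
The idea that the quantization noise of one weight can be compensated by the next can be sketched with simple error feedback. This is only an illustration of the general principle, not the algorithm proposed in the paper; the function names and the fixed `scale` parameter are assumptions made for the example.

```python
def ternarize_independent(weights, scale):
    """Quantize each weight on its own to a level in {-1, 0, +1}."""
    return [max(-1, min(1, round(w / scale))) for w in weights]


def ternarize_with_feedback(weights, scale):
    """Ternary quantization that carries each weight's rounding error forward.

    Assumption for illustration: the error left over from one weight is
    added to the next weight before it is quantized.
    """
    levels = []
    carry = 0.0
    for w in weights:
        target = w + carry
        level = max(-1, min(1, round(target / scale)))
        levels.append(level)
        carry = target - level * scale  # residual noise passed to the next weight
    return levels


if __name__ == "__main__":
    weights = [0.4, 0.4, 0.4, 0.4]
    print(ternarize_independent(weights, 1.0))
    print(ternarize_with_feedback(weights, 1.0))
```

In this toy case independent rounding maps every weight to zero, while the feedback variant keeps the sum of the quantized weights close to the sum of the originals. If each ternary weight is stored in 2 bits instead of a 32-bit float (an assumption, since the abstract does not state the storage format), the ratio 32/2 = 16 is consistent with the compression figure quoted above.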

Abstract:Compared with single image super-resolution, video super-resolution needs to align and fuse time series images. This frame-recurrent-based video super-resolution network consists of three parts:(1)The frame sequence alignment network extracts the image features and aligns the neighbor frames to the center frame;(2)The frame fusion network fuses the aligned frames and supplements the center frame information with the neighbor frame information;(3)The super-resolution network enlarges the fused image to obtain the final high-definition image. Experiments show that, compared with existing algorithms, video super-resolution technology based on frame loop network produces sharper images and higher quality.

Abstract:In the production of coal mines, accidents happen to workers once in a while because of absence of safety helmet. In order to establish digital safety helmet detection system, a wearing safety helmet detection model based on convolutional neural networks is proposed. Specifically, the model is based on advanced Darknet53 as model backbone, which is used to extract feature information from pictures. In addition, attention mechanism is introduced to enrich the propagation of information between features, enhancing the generalization of model. Finally, a wearing safety helmet pre-training dataset and a real mine scene dataset are built, and comprehensively comparative experiments are conducted on PyTorch platform to verify the effectiveness of the model designs, which achieves an excellent performance of 92.5 mAP on the real mine scene dataset.

Abstract:To address the long detection time, low detection rate and small number of identifiable types in vehicle logo detection, a method based on the You Only Look Once (YOLOv3) network is proposed. To make the network suitable for detecting small targets, the feature extraction structure Darknet-53 is replaced with Darknet-19 and the multi-scale prediction layers are reduced to two, which reduces the number of network parameters. At the same time, to increase the proportion of the vehicle logo in the image and let the convolutional neural network learn more vehicle logo features, the vehicles are cropped from the images and manually labeled to construct a data set of 46 vehicle classes (VLDS-46). The experimental results show that the model meets the real-time requirement while achieving a high detection rate, with an average detection time of 9 ms.

Abstract:This paper proposes an adaptive weighted fusion method based on an audio-video matching layer. Under different degrees of noise, the recognition rates of image and sound decrease as the noise increases. When the weights of the two modes differ, the stability of the fusion system also differs. Adaptive weighted fusion of the two modes can not only balance the advantages and disadvantages of the different biometric modes, but also choose the optimal weight for the decision. Experiments show that the proposed method is feasible and has a higher recognition rate and better robustness than single-mode identification.

Abstract:Traditional road crack recognition methods are based on R-CNN, SPPnet, HOG+SVM and similar techniques, but their recognition accuracy is low and their detection speed is slow. In view of these shortcomings, a road crack recognition method based on Faster R-CNN is proposed. Firstly, road crack images are collected to build a Pascal VOC data set. Secondly, the TensorFlow deep learning framework developed by Google is used to train the Faster R-CNN on the data set and analyze various performance parameters. The experimental results show that the training loss can be reduced to 0.1885 and the AP value can reach 0.7802 after 20 000 iterations, achieving good results.

Abstract:Spoofing face can be used to deceive face authentication system for illegal purposes, and thus it poses a serious threat to the face recognition system. Most of the existing methods are training and testing in the same dataset, and the effect is not ideal when they are used in cross dataset scenario. In order to solve this problem, this paper proposes to use histogram of oriented gradients(HOG) to extract the cue information in the context and then send the extracted features to the one-class support vector machine(OCSVM) for training and classification. The classification results are combined with the abnormal cues detected in the context. And the algorithm is verified on the public database NUAA and CASIA-FASD. The experimental results show that the generalization ability and detection accuracy of the proposed algorithm has improved over the existing method when used in cross dataset scenario.

Abstract:In order to improve the accuracy of text similarity detection algorithm, this paper proposes a text similarity detection method combining latent Dirichlet Allocation(LDA) and Doc2Vec model, and names the model obtained by the algorithm HybridDL model. This algorithm obtains the document vector through Doc2Vec training of the document, and then obtains the probability of the occurrence of the document topic and the feature words under each topic with the LDA model, calculates the probability weighted sum of each topic and feature words in the document, and maps them to the Doc2Vec document vector. Experimental results show that the new algorithm model is more sensitive to the judgment of similar text than the traditional Doc2Vec model, and has higher accuracy in the detection of text similarity.

Abstract:Existing text similarity measurements often use the TF-IDF method to model texts as term frequency vectors without considering the structural features of texts. This paper combines the structural features of texts with the TF-IDF method and proposes a text similarity measurement for science and technology project texts. This approach firstly pre-processes a text and extracts module texts according to its structural features. After applying the TF-IDF method to these extracted module texts, this method extracts the top keywords of each module text, obtains its feature vector representation, and finally uses cosine formula to calculate the similarity of two texts. By comparing with the TF-IDF method, experimental results show that the proposed method can promote the evaluation metrics of F-measure.
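
The TF-IDF and cosine steps named in the abstract can be sketched as follows. This is a minimal illustration using only the standard library: the smoothed IDF formula is one common variant chosen here as an assumption, the function names are hypothetical, and the paper's structure-based module extraction and top-keyword selection are not reproduced.

```python
import math
from collections import Counter


def tfidf_vectors(tokenized_docs):
    """Return one sparse TF-IDF vector (dict term -> weight) per document."""
    n_docs = len(tokenized_docs)
    doc_freq = Counter()
    for tokens in tokenized_docs:
        doc_freq.update(set(tokens))  # count each term once per document
    vectors = []
    for tokens in tokenized_docs:
        vec = {}
        for term, count in Counter(tokens).items():
            tf = count / len(tokens)
            # Smoothed IDF (assumed variant) so that common terms keep a small weight.
            idf = math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0
            vec[term] = tf * idf
        vectors.append(vec)
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors."""
    dot = sum(value * b.get(term, 0.0) for term, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    docs = [["neural", "network", "model"],
            ["neural", "network", "model"],
            ["wheat", "harvest"]]
    v = tfidf_vectors(docs)
    print(cosine_similarity(v[0], v[1]), cosine_similarity(v[0], v[2]))
```

In a structure-aware variant such as the one the abstract describes, the same two functions would presumably be applied per module text rather than to the whole document.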

Abstract:Camouflage face recognition has great application value in the field of criminal investigation and security. Aiming at the shortcomings of few researches on camouflage face recognition and weak robustness at present, a camouflaged face recognition algorithm based on deep neural network is proposed. The SqueezeNet network model has been improved and combined with the FaceNet network architecture for identity recognition of face images. By adding camouflage face images in the training data set, the network can learn the characteristics of the camouflages. The experimental results show that the recognition accuracy of the algorithm is close to 90%, which is better than other network models.

Abstract:Traditional turnout fault detection methods not only consume a lot of manpower, material and financial resources, but also rely on manual experience. With the rapid development of artificial intelligence, designing an intelligent system to diagnose turnouts has become a key problem. This paper proposes an intelligent detection system comprising data preprocessing, feature extraction, an intelligent switch classifier and a more suitable evaluation criterion. The system is simulated in MATLAB. Experimental results on switch current data of models W1902# and W1904# from Guangzhou village station show that the proposed intelligent detection method not only has self-learning ability, but also detects faults efficiently under complex environmental changes, and the recognition time is only 0.04 s, which meets the real-time requirement of railways.

Abstract:Aiming at the shortcomings of high-dimensional sparseness in the short text of Weibo on traditional topic detection methods, a K-means Weibo topic discovery model based on feature fusion was proposed. In order to better express the semantic information of Weibo topics in this paper, the word-pair vector model(Biterm_VSM) co-occurring in sentences is used instead of the traditional vector space model(VSM), and combined with the topic model(Latent Dirichlet Allocation,LDA) to mine the potential semantics of Weibo short text, merging features obtained from the two models, and applying K-means clustering algorithm to discover topics. The Experimental results show that compared with the traditional topic detection method, the model′s adjusted Rand index(ARI) is 0.80, which is 3%~6% higher than the traditional topic detection method.
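
The adjusted Rand index (ARI) used as the evaluation metric above can be computed from pair counts. The sketch below follows the standard pair-counting definition; the function name is chosen for this illustration and the code is unrelated to the paper's own implementation.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Chance-corrected agreement between two clusterings of the same items."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label lists must have the same length")
    total_pairs = comb(len(labels_true), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: the clusterings agree trivially
    # Pairs of items placed together in both, in the reference, and in the prediction.
    sum_both = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    sum_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = sum_true * sum_pred / total_pairs
    max_index = 0.5 * (sum_true + sum_pred)
    if max_index == expected:
        return 1.0
    return (sum_both - expected) / (max_index - expected)


if __name__ == "__main__":
    print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
    print(adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1]))
```

The index does not depend on how clusters are named, so a clustering that matches the reference up to a relabeling scores 1, and values near 0 indicate agreement no better than chance.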

Abstract:Artificial neural networks (ANN) are used to extract the scattering parameters and noise parameters of GaAs high electron mobility transistors with different frequency bands and gate widths. Two neural networks are trained separately, one on the scattering parameters and one on the noise parameters. The average relative error and mean square error are obtained by comparing different numbers of hidden layers and neurons. It is found that 8-8-6 and 6-4 are the optimal hidden-layer configurations of the neural networks for the scattering parameters and the noise parameters, respectively. The test results show that the average relative error of the scattering parameters is 2.79%. Compared with the conventional single neural network structure, the average relative error is improved by 31.3%. This shows that the model in this paper has higher accuracy and reliability and is well suited to parameter extraction of RF transistors with wide band gap and strong nonlinearity.

Abstract:A new deep learning model based on Seq2Seq and Bi-LSTM is proposed for Chinese text automatic proofreading. Different from the traditional rule-based and probabilistic statistical methods, a Chinese text automatic proofreading model is implemented by adding Bi-LSTM unit and attention mechanism based on Seq2Seq infrastructure improvement. Comparative experiments of different models were carried out through the open data sets. Experimental results show that the new model can effectively deal with long-distance text errors and semantic errors. The addition of Bi-RNN and attention mechanism can improve the performance of Chinese text proofreading model.

Abstract:Traditional feature extraction methods yield limited discriminant features, while deep learning methods need large amounts of labeled data and are time-consuming. This paper presents a face recognition method that fuses deep and shallow features. Firstly, the HOG feature is extracted from each image and its dimensionality is reduced; the PCANet feature is extracted at the same time and its dimensionality is also reduced. Secondly, the two types of features are fused and discriminant features are extracted further. Finally, an SVM is adopted for classification. Experiments on the AR database verify the effectiveness and robustness of the proposed method.

Abstract:A convolutional neural network(CNN) inference system is designed based on the FPGA platform for the problem that the convolutional neural network infers at low speed and it is power consuming on the general CPU and GPU platforms. By computing resource reusing, parallel processing of data and pipeline design, it greatly improved the computing speed, and reduced the use of computing and storage resources by model compression and sparse matrix multipliers using the sparseness of the fully connected layer. The system uses the ORL face database. The experimental results show that the model inference performance is 10.24 times of the CPU, 3.08 times of the GPU and 1.56 times of the benchmark version at the working frequency of 100 MHz, and the power is less than 2 W. When the model is compressed by 4 times, the system identification accuracy is 95%.

Abstract:Traditional face detection algorithms often cannot extract useful detection features from the original image, and convolutional neural networks can easily extract high-dimensional feature information, which is widely used in image processing. In view of the above shortcomings, a simple and efficient deep learning Caffe framework is adopted and trained by AlexNet network. The data set is LFW face dataset, and a model classifier is obtained. Image pyramid transformation is performed on the original image data, and feature graph is obtained by forward propagation. The inverse transformation yields the face coordinates, uses the non-maximum suppression algorithm to obtain the optimal position, and finally reaches a two-class face detection result. The method can realize face detection with different scales and has high precision, and can be used to construct a face detection system.
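
The non-maximum suppression step mentioned above can be sketched as a greedy procedure over scored boxes. This is the generic textbook form, not code from the paper; the box format `(x1, y1, x2, y2)`, the function names and the 0.5 threshold are assumptions for the example.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for idx in order:
        # Keep a box only if it does not overlap too much with a better one.
        if all(iou(boxes[idx], boxes[k]) <= iou_threshold for k in keep):
            keep.append(idx)
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    print(non_max_suppression(boxes, [0.9, 0.8, 0.7]))
```

When candidate windows come from several pyramid levels, as in the abstract, they would first be mapped back to original image coordinates before this step is applied.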

Abstract:This paper proposes to apply Transformer model in the field of Chinese text automatic proofreading. Transformer model is different from traditional Seq2Seq model based on probability, statistics, rules or BiLSTM. This deep learning model improves the overall structure of Seq2Seq model to achieve automatic proofreading of Chinese text. By comparing different models with public data sets and using accuracy, recall rate and F1 value as evaluation indexes, the experimental results show that Transformer model has greatly improved proofreading performance compared with other models.

Abstract:Image classification is to distinguish different types of images based on image information. It is an important basic issue in computer vision, and is also the fundamental for image detection, image segmentation, object tracking and behavior analysis. Deep learning is a new field in machine learning research. Its motivation is to simulate the neural network of the human brain for analytical learning. Like the human brain, deep learning can interpret the data of images, sounds and texts. The system is based on the Caffe deep learning framework. Firstly, the data set is trained and analyzed, and a model based on deep learning network is built to obtain the image feature information and corresponding data classification. Then the target image is expanded based on the bvlc-imagenet training set model. And finally,"search an image with an image" Web application is achieved.

Abstract:To serve the intelligent customer service dialogue system of the State Grid Customer Service Center, knowledge must be extracted from a large number of documents, traditional knowledge bases and dialogue data. This paper proposes a new knowledge graph framework that integrates a fact graph and an event evolutionary graph and can be built from multi-source data. The constructed knowledge graph performs well in vertical-domain tasks such as accurate question answering, knowledge support for the customer service system, dialogue management guidance and knowledge reasoning. The new knowledge graph has been put into use in the question answering system of the customer service center, where it has changed the working mode of customer service and greatly improved its efficiency.

Abstract:Aiming at the problem of low correct recognition rate and relying on experience to select parameters in gear box fault diagnosis by using neural network, a fault diagnosis method of gear box based on particle swarm optimization BP network is proposed. In this paper, a fault model is established by extracting characteristic parameters from gear vibration principle. The model takes eigenvector of gear box as input and fault type as output. The fault diagnosis of gear box is realized by BP neural network, probabilistic neural network and particle swarm optimization BP neural network. The simulation results show that the convergence speed of BP neural network for gear box fault diagnosis is slow, and the recognition rate of fault diagnosis is 82%. The recognition rate of probabilistic neural network model fault diagnosis is determined by selecting spreads based on experience, and the maximum recognition rate is 98%. The recognition rate of BP neural network fault diagnosis based on particle swarm optimization is 100% and adaptive ability is strong.

Abstract:This paper investigates the application of convolutional neural networks (CNN) in CT image diagnosis of hepatic hydatidosis. Two types of CT images of hepatic hydatid disease were selected for normalization, improved median filtering denoising and data enhancement. Based on the LeNet-5 model, an improved CNN model, CTLeNet, is proposed. A regularization strategy is adopted to reduce overfitting, a dropout layer is added to reduce the number of parameters, and classification experiments are conducted on the two classes of liver hydatid images. Meanwhile, feature visualization is realized through deconvolution to explore the potential features of the disease. The results show that the CTLeNet model achieves good results in the classification task and is expected to provide auxiliary diagnosis and decision support for liver hydatidosis through deep learning.

Abstract:A vehicle tracking system is designed on the NVIDIA embedded platform Jetson TX2. Video data in YUV420 format is collected from the onboard camera and sent to the Tegra Parker hardware HEVC encoder for encoding. The output stream is encapsulated with RTP and sent by UDP broadcast. The Gstreamer multimedia framework is used to develop the receiving and decoding program. Finally, the acquired video is tracked and displayed dynamically. The Yolo V2 detection algorithm is used to detect vehicles and provide tracking objects for the tracking system. The Meanshift method tracks the detected vehicles more accurately, and adding the Kalman filtering algorithm allows the position of the target model in the current frame to be predicted. The system can realize real-time encoding and transmission of ultra-high-definition 4K video at a frame rate of 60 f/s. The encoding rate of the HEVC hardware encoder in this system is three orders of magnitude higher than that of the x265 encoder on a PC, and its PSNR is 6 dB higher, which makes the system more suitable for intelligent transportation.
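
How a Kalman filter predicts a target position for the current frame can be sketched for a single image coordinate. The constant-velocity motion model, the noise variances and the class name below are assumptions made for illustration; the abstract does not specify the model used in the system.

```python
class ConstantVelocityKalman1D:
    """Kalman filter for one coordinate with state [position, velocity]."""

    def __init__(self, dt=1.0, process_var=0.01, measurement_var=1.0):
        self.dt = dt
        self.q = process_var       # assumed process noise added per step
        self.r = measurement_var   # assumed measurement noise variance
        self.position = 0.0
        self.velocity = 0.0
        # Covariance [[p00, p01], [p10, p11]]; large values mean "unknown".
        self.p00, self.p01, self.p10, self.p11 = 100.0, 0.0, 0.0, 100.0

    def predict(self):
        """Advance the state by one frame and return the predicted position."""
        dt = self.dt
        self.position += dt * self.velocity
        a, b, c, d = self.p00, self.p01, self.p10, self.p11
        # P <- F P F^T + Q with F = [[1, dt], [0, 1]] and Q = q * I.
        self.p00 = a + dt * (b + c) + dt * dt * d + self.q
        self.p01 = b + dt * d
        self.p10 = c + dt * d
        self.p11 = d + self.q
        return self.position

    def update(self, measured_position):
        """Correct the prediction with a detection of the position."""
        innovation = measured_position - self.position
        s = self.p00 + self.r
        k0, k1 = self.p00 / s, self.p10 / s   # Kalman gain for H = [1, 0]
        self.position += k0 * innovation
        self.velocity += k1 * innovation
        a, b, c, d = self.p00, self.p01, self.p10, self.p11
        # P <- (I - K H) P
        self.p00, self.p01 = (1.0 - k0) * a, (1.0 - k0) * b
        self.p10, self.p11 = c - k1 * a, d - k1 * b
        return self.position


if __name__ == "__main__":
    kf = ConstantVelocityKalman1D()
    for frame in range(30):
        kf.predict()
        kf.update(2.0 * frame)   # target moving 2 pixels per frame
    print(kf.predict())
```

In a tracker of the kind described, the predicted position would typically seed the Meanshift search window for the next frame, and the Meanshift result would be fed back through `update`.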

Abstract:The storage capacity of neural networks has always been a major flaw. Its storage is mainly reflected in the weight coefficient. Therefore, it is very difficult to train a neural network with a large amount of parameters. This paper intends to design an external associative memory for the neural network, which can effectively serve the neural network. The input is associated with the query and the result of the query is passed to the neural network as an auxiliary input. In addition, this paper designs a vector embedding model of natural language sentences, and assembles the model and associated memory to form an associative storage system with automatic association statement semantic vectors. The performance indicators of this system meet the design requirements.

Abstract:To address tracking failures caused by rotation, occlusion and deformation of a moving target in a video sequence, a tracking method based on multi-region segmentation of the target is proposed. The target is divided into multiple overlapping regions, the regions that remain relatively stable during tracking are selected for positioning, and different template updating strategies are then applied to the tracked target according to the weights of the different target regions. In this way, the algorithm's resistance to occlusion and rotation is increased. Experimental results show that the proposed method adapts to occlusion and rotation.

Abstract:Parameter prediction for insulated gate bipolar transistors (IGBT) can effectively avoid the economic loss and safety problems caused by their failure. Based on an analysis of IGBT parameters, this paper designs an SoC hardware system for IGBT parameter prediction based on an LSTM network. The system uses an ARM processor as the general controller to control the calling of each sub-module and the transmission of data. In the FPGA, the matrix-vector inner product algorithm is optimized to improve the data operation speed in the LSTM network, and a polynomial approximation method reduces the resources occupied by the activation function. The experimental results show that the average prediction accuracy of the system is 92.6%, the calculation speed is 3.74 times that of the CPU, and the system has low power consumption.
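
Why a polynomial can stand in for an activation function on resource-limited hardware can be illustrated with the sigmoid. The sketch below uses the third-order Taylor expansion around zero, which is only one possible choice and an assumption here; the abstract does not say which polynomial or which input range the system uses.

```python
import math


def sigmoid(x):
    """Reference sigmoid, which needs an exponential."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_poly3(x):
    """Cubic Taylor approximation 1/2 + x/4 - x^3/48, evaluated in Horner form.

    Only multiplications and additions are needed, but the approximation
    is accurate only near zero (assumed working range: |x| <= 1).
    """
    return 0.5 + x * (0.25 - x * x / 48.0)


if __name__ == "__main__":
    worst = max(abs(sigmoid(i / 100.0) - sigmoid_poly3(i / 100.0))
                for i in range(-100, 101))
    print(worst)
```

Outside the assumed range the cubic drifts away from the sigmoid, so a practical design would need range limiting or a piecewise approximation.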

Abstract:In recent years, convolutional neural networks (CNN) and recurrent neural networks (RNN) have been widely used in the field of text classification. This paper proposes a feature fusion model of CNN and long short-term memory network (LSTM). Long-term dependence is captured by using the LSTM in place of the pooling layer, which yields a joint CNN and RNN framework that overcomes the problem that a single convolutional neural network ignores the semantic and grammatical information in the context of words. The proposed method plays an important role in reducing the number of parameters and taking into account the global characteristics of text sequences. The experimental results show that the same level of classification performance can be achieved with a smaller framework, and that the model surpasses several other methods of the same type in accuracy.

Abstract:In this paper, a complete multifactor model is constructed based on financial indicators, and the prediction of SVM classification in the multifactor model is improved. The ranking method is used for data preprocessing, and then an SVM predicts the stock return class. Finally, the distance from each data point to the hyperplane is used to improve the classification prediction. With this strategy, a portfolio of CSI500 constituent stocks gains an 88.96% accumulated return from 2016Q4 to 2018Q1. The technical analysis strategies moving average (MA) and channel breakout (CB), used as trade timing strategies, can decrease fluctuation and drawdown. High-frequency data are used to reconstruct the MA strategy and obtain lower fluctuation. This model provides a new research perspective: SVM characteristics are used for prediction improvement, and technical analysis for strategy return.

Abstract:With the development of interactive intelligence technology, dialogue system becomes more and more practical. Unlike general chat-bots, the dialogue system for specific domain is a practical dialogue system with contextual reasoning and based on knowledge. The insurance domain is a typical specific domain. This paper introduces a basic construction method of the dialogue system in insurance related domain, which can help users to construct a dialogue system in a specific domain and scene quickly and practically, and has the ability of promotion and expansion.

Abstract:Identity authentication technology has developed greatly, and various fraudulent means of forging legitimate user information have emerged. To address this problem, this paper proposes a deep learning face detection algorithm that analyzes the difference between real faces and fraudulent faces and preprocesses the real face and photo images with mean centering, ZCA whitening for noise reduction, random rotation and other operations. A convolutional neural network then extracts the facial features of the images, and the extracted features are sent to a neural network for training and classification. The algorithm is verified on the public database NUAA. The experimental results show that the method reduces the computational complexity and increases the recognition accuracy.

Abstract:To address the high computational complexity and large memory requirements of current object detection algorithms, we designed and implemented an FPGA-based deep learning object detection system. We also designed the hardware accelerator for the YOLOv2-Tiny object detection algorithm, modeled the processing delay of each accelerator module, and described the design of the convolution module. The experimental results show performance and energy gains of 5.5x and 94.6x respectively over the software Darknet on an 8-core Xeon server, and of 84.8x and 67.5x over the software version on the dual-core ARM Cortex-A9 on Zynq. The current design also outperforms previous work in performance.

Abstract:Aiming at the low precision of multi-scale face detection caused by large passenger flow and complicated background in large places such as stations and shopping malls, a multi-scale face detection method based on RefineDet multi-layer feature map fusion is established. Firstly, the first-level network is used for feature extraction and the face position is roughly predicted on the feature maps of different scales. Then, in the second level, the feature pyramid network is used to fuse the low-level features and the high-level features together to further enhance the semantics of small-sized faces information. Lastly, the detection box is secondarily suppressed by the confidence and focal loss function to achieve accurate return of the border. In the experiment, the aspect ratio between the width and the height of the face candidate region is only set to 1:1 in order to reduce the amount of calculation and improve the face detection accuracy. Experimental results on Wider Face datasets show that the method can effectively detect different scales of human faces, and the test results of MAP(mean average precision) on the three sub-data sets of Easy, Medium and Hard are 93.4%, 92% and 84.4% respectively, in particular, the detection accuracy of small-sized human faces is significantly improved.
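
The focal loss referred to above can be written for a single binary prediction as follows. This is the commonly used form with a focusing parameter gamma and a class-balancing weight alpha; the default values and the function name are assumptions for illustration and are not taken from the paper.

```python
import math


def focal_loss(prob, target, gamma=2.0, alpha=0.25):
    """Binary focal loss for one prediction.

    prob   -- predicted probability of the positive class (face)
    target -- 1 for a positive example, 0 for a negative one
    """
    p_t = prob if target == 1 else 1.0 - prob
    alpha_t = alpha if target == 1 else 1.0 - alpha
    p_t = min(max(p_t, 1e-12), 1.0)  # avoid log(0)
    # The factor (1 - p_t)^gamma shrinks the loss of well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    print(focal_loss(0.9, 1), focal_loss(0.6, 1))
```

With gamma set to 0 the expression reduces to a weighted cross-entropy, which makes the down-weighting of easy examples for gamma greater than 0 easy to see by comparison.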

Abstract:The multi-instance multi-label learning framework is a new machine learning framework for solving ambiguity problems. In this framework, an object is represented by a set of examples and is associated with a set of category labels. The E-MIMLSVM+ algorithm is a classical classification algorithm that uses the degeneration idea in the multi-instance multi-label learning framework. It cannot learn from unlabeled samples, which leads to poor generalization ability. This paper uses a semi-supervised support vector machine to improve the algorithm. The improved algorithm can learn from a small number of labeled samples and a large number of unlabeled samples, which helps to discover the hidden structural information inside the sample set and understand its true distribution. The comparison experiments show that the improved algorithm effectively improves the generalization performance of the classifier.

Abstract:Time series prediction is an important part of anomaly detection for key performance indicators in data centers. For the time series, the wavelet basis function is used as the hidden-layer node transfer function to construct a wavelet neural network for prediction. At the same time, the momentum gradient descent method is adopted to improve the learning efficiency of the neural network. The optimal solution found by the particle swarm algorithm is then used as the initial values of the neural network parameters. Finally, the method is simulated in MATLAB, and the time series of key performance indicators are predicted with higher accuracy.
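
The momentum gradient descent update mentioned above can be sketched on a one-dimensional problem. The function name, the hyperparameter values and the quadratic test function are assumptions for illustration; in the abstract the same kind of update would be applied to the wavelet network's parameters.

```python
def momentum_gradient_descent(grad, x0, learning_rate=0.1, momentum=0.9, steps=300):
    """Minimize a 1-D function given its gradient, using classical momentum."""
    x = x0
    velocity = 0.0
    for _ in range(steps):
        # The velocity accumulates past gradients, which smooths the trajectory.
        velocity = momentum * velocity - learning_rate * grad(x)
        x += velocity
    return x


if __name__ == "__main__":
    # f(x) = (x - 3)^2 has gradient 2 (x - 3) and its minimum at x = 3.
    print(momentum_gradient_descent(lambda x: 2.0 * (x - 3.0), x0=0.0))
```

The starting point `x0` plays the role that the particle swarm result plays in the abstract: a better initial value leaves less work for the gradient steps.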

Abstract:For a long time, traffic accidents of all kinds have seriously affected people's lives, property safety, and social and economic development. Traffic accident analysis is the investigation and study of traffic accident data. It identifies patterns in accident trends and the various factors influencing accidents overall, and studies the relationships between them, so as to quantitatively understand the nature and internal laws of accident phenomena. Based on an analysis of the text data recorded for traffic accidents, this paper proposes a text topic extraction model and technique for finding drivers' risk factors in traffic accidents, in order to solve the problem that traffic violations have been difficult to mine in the past and to identify the most dominant factors affecting traffic accidents. Finally, taking traffic accidents in Beijing as an example and drawing on the experience of traffic management experts, the effectiveness of the proposed model is verified. The results show that the model is valid and that its conclusions are consistent with long-term management experience.

Abstract:To address the performance degradation of objective image quality assessment methods in practical application scenarios, this paper proposes a visual saliency and complementary features method based on the pooling of structure and energy, which integrates human visual characteristics into several stages of image feature processing. Firstly, three complementary features (image gray energy, contrast energy and gradient structure) are processed with a joint spatial-frequency transformation according to the characteristics of the human eye. Secondly, multichannel information is extracted from each of these three layers of visual features and assessed separately. Finally, the assessments of the layers are adaptively pooled from the inner layer to the outer layer based on visual characteristics and image distortion. Experiments show that the proposed method performs at a higher level with better stability, and that assessment performance improves in practical application scenarios.

Abstract:To test the assumption that stock price movements resemble past movements, price movement is divided simply into up and down and forecast with the K-Nearest Neighbor (KNN) algorithm. A sliding window method is used to compare which historical period is most similar to the current one in its data features. Multiple KNN models are combined into ensemble models to generalize the strategy and adjust returns. The CSI500 price is used for verification. Using these predictions, a single KNN model earns a 76.72% return, fees included, from 2017 to September 2018; the remote historical period is more similar to the current one in its data features; and the ensemble models are better at risk control. The results support the assumption that stock prices resemble their nearest historical neighbors, which could be used as an investment timing strategy.
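
The up/down forecast described in this abstract can be sketched as follows. This is a minimal illustration only: the helper names `make_windows` and `knn_predict`, the use of daily returns as features, the Euclidean distance and the synthetic price series are assumptions, not details taken from the paper.

```python
import numpy as np

def make_windows(prices, window):
    """Turn a price series into (features, labels).

    Features: the `window` most recent one-step returns.
    Label: 1 if the following return is positive, otherwise 0.
    """
    prices = np.asarray(prices, dtype=float)
    returns = np.diff(prices) / prices[:-1]
    X = np.array([returns[i:i + window] for i in range(len(returns) - window)])
    y = (returns[window:] > 0).astype(int)
    return X, y

def knn_predict(X_train, y_train, x, k=3):
    """Majority vote among the k nearest historical windows (Euclidean distance)."""
    dist = np.linalg.norm(X_train - x, axis=1)
    nearest = np.argsort(dist)[:k]
    return int(y_train[nearest].sum() * 2 > k)

# Synthetic series that repeats the pattern: up, up, down.
prices = [100.0]
for i in range(30):
    prices.append(prices[-1] * (0.98 if i % 3 == 2 else 1.01))

X, y = make_windows(prices, window=3)
pred = knn_predict(X[:-1], y[:-1], X[-1], k=3)   # forecast the final move from earlier windows
```

Restricting `X_train` to different historical periods is one way to compare which period is most similar to the current window.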

Abstract:Face recognition technology is an important research field for deep learning. To overcome the shortcomings of the traditional open-loop face cognition mode and of deep neural network structures, and to imitate the human cognition model, which evaluates cognitive results in real time to self-optimize the feature space and the classification cognition criteria, this paper draws on closed-loop control theory and explores an intelligent face cognition method with deep ensemble learning and a feedback mechanism. Firstly, based on the DeepID neural network, an unstructured feature space of face images with a determined mapping relationship from the global to the local is established. Secondly, based on feature separability evaluation and variable precision rough set theory, a face cognition decision information system model with unstructured dynamic feature representation is established from the perspective of information theory, to reduce the unstructured feature space. Thirdly, an ensemble of random vector functional-link nets is used to construct the classification criterion for the reduced unstructured feature space. Finally, an entropy measure index of the face cognition result is constructed to provide a quantitative basis for the self-optimizing adjustment of the face feature space and the classification cognition criteria. The experimental results show that the proposed method effectively improves the recognition rate of face images compared with existing methods.

Abstract:With the development of computer technology, fire image processing that combines computer vision, machine learning, deep learning and other technologies has been widely studied and applied. To address the complex preprocessing and high false positive rate of traditional image processing methods, this paper proposes a fire detection method based on a deep convolutional neural network, which reduces the preprocessing steps and integrates the whole fire identification process into a single deep neural network for easy training and optimization. To handle false detections caused by scenes that resemble fire, the method exploits the motion characteristics of fire and compares the position of the fire bounding box across earlier and later video frames to eliminate interference from lights and other fire-like scenes. After comparing several open-source deep learning frameworks, the Caffe framework was chosen for training and testing. The experimental results show that the method achieves recognition and localization of fire in images. It is suitable for different fire scenarios and has good generalization and anti-interference ability.

Abstract:During approach and landing, the instrument landing system (ILS) is vulnerable to the external environment and airspace, which reduces navigation accuracy. This paper proposes an improved integrated navigation algorithm that combines an inertial navigation system (INS) with a GBAS landing system (GLS). The algorithm uses the difference between the position outputs of the integrated navigation system as the measurement for an unscented Kalman filter (UKF) improved with a BP neural network, and obtains the globally optimal estimate of the system through an optimal weighting method. Compared with the traditional federated filtering algorithm, the proposed algorithm effectively reduces the measurement noise, reduces the error during approach and landing, and improves navigation accuracy.

Abstract:With the rapid development of Internet applications and the rapid growth in users, online reviews and opinions about the stock market largely reflect market conditions and, at the same time, influence the market's ups and downs. Quickly and efficiently analyzing the attitudes and opinions of netizens toward the stock market therefore plays an important role in guiding stock market prediction. This paper studies the rising and falling trends of stocks by analyzing the sentiment polarity of stock commentary from different professional publishers. It proposes a sentiment analysis method based on a consistent, integrated dictionary of financial phrases with extra weight given to the end of each paragraph, which reduces the domain dependence of the sentiment dictionary and effectively improves the accuracy of sentiment analysis. In addition, it proposes a windowed stock prediction model, which can be used to find the optimal size of the forecast event window. The experimental results show that the rising or falling trend of a particular stock can be predicted better on the basis of stock market sentiment analysis.

Abstract:Automatic modulation recognition of multi-system communication signals based on feature extraction and pattern recognition is an important research topic in software radio. It is one of the key technologies for non-cooperative communications in complex electromagnetic environments, with applications such as spectrum management and spectrum detection. This paper proposes a new deep learning algorithm for automatic modulation recognition of communication signals. It uses autoencoders for feature extraction to obtain a feature set with high anti-interference ability, then classifies the selected features with a BP neural network. The algorithm can automatically identify the modulation of MQAM communication signals. Simulation results demonstrate that the proposed algorithm performs well in classification and recognition while effectively improving the anti-interference ability of automatic digital modulation identification.

Abstract:To address the poor recognition of domain-specific core words in State Grid customer-service telephone speech recognition, this paper proposes a method based on HCLG domain weight enhancement and domain word correction. The method allows domain words to be added quickly and in real time, so that the language model is optimized dynamically and speech recognition improves. The model and algorithm are optimized for the different areas of telephone service at the State Grid Customer Service Center, such as consultation, maintenance and complaints. The speech recognition results improve greatly.

Abstract:As 4G network technology matures, users' business demands on operators keep rising. Retaining users and meeting their business needs by studying user attributes, establishing convenient and fast service experiences, and building a maintenance and retention system are the most important tasks for the future development of China's telecommunication operators. This paper first analyzes the current state of retention and development of mobile users and puts forward the relevant user attributes. Secondly, data mining is used to build and analyze a model based on user stability and user value. Finally, prospects are discussed for multi-channel, precisely targeted pushes aimed at retaining existing users.

Abstract:Because rain streaks in an image vary in shape and size and are unevenly distributed, a single neural network learns unevenly distributed rain density poorly, and its rain removal effect is limited. This paper proposes a rain-density-aware guided expansion network to remove rain from a single image. The network has two parts. The first is a rain density perception network that classifies images by rain density (heavy, medium or light rain). The second is an expansion network which, guided by the rain density classification, learns the detailed characteristics of different rain densities in order to detect rain streaks and remove rain. Experiments on synthetic and real data sets show the effectiveness of the method for rain removal.

Abstract:Deep neural networks resemble biological neural networks, so they can extract deep hidden features of information efficiently and accurately, learn multiple layers of abstract features, and learn more from cross-domain, multi-source and heterogeneous content. This paper presents a personalized recommendation model that builds on the feature extraction and self-learning strengths of a combined multi-user-item deep neural network. The model performs self-learning and feature extraction on the input multi-source heterogeneous data, fuses collaborative filtering for broad personalization to generate candidate sets, and then produces a ranked set through two rounds of model self-learning. It can thereby deliver accurate, real-time and personalized recommendations. The experimental results show that the model learns and extracts users' implicit features well and, to some extent, alleviates the data sparsity and new-item problems of traditional recommendation systems.

Abstract:With mobile robot visual navigation as the application background, an improved ORB algorithm is proposed to solve the problems of unevenly distributed feature points and too many redundant features in visual SLAM. Firstly, the scale-space pyramid of each image is divided into a grid to add scale information. Secondly, feature points are detected using an improved adaptive FAST corner extraction within a set region of interest. Thirdly, non-maximum suppression is used to suppress the feature points output at the low threshold. Finally, a variance value of the feature points computed over image regions is used to evaluate the distribution of feature points in the image. Experiments verify that the improved ORB algorithm yields a more uniform distribution, fewer overlapping feature points and a shorter run time.
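
Two of the steps named in this abstract, non-maximum suppression of feature points and a variance-based measure of how evenly they are spread, can be sketched as below. The function names, the radius-based suppression rule and the fixed grid are assumptions for illustration; the abstract does not specify the paper's exact formulation.

```python
import numpy as np

def nms_keypoints(points, scores, radius):
    """Keep the strongest keypoints and drop weaker ones within `radius` pixels of a kept one."""
    points = np.asarray(points, dtype=float)
    order = np.argsort(scores)[::-1]          # indices from strongest to weakest response
    kept = []
    for idx in order:
        if all(np.linalg.norm(points[idx] - points[j]) > radius for j in kept):
            kept.append(int(idx))
    return kept

def grid_count_variance(points, width, height, cells=4):
    """Variance of keypoint counts over a cells x cells grid; lower means more uniform."""
    points = np.asarray(points, dtype=float)
    counts, _, _ = np.histogram2d(
        points[:, 0], points[:, 1], bins=cells, range=[[0, width], [0, height]]
    )
    return float(counts.var())

pts = [(10, 10), (11, 10), (50, 50), (52, 51)]
kept = nms_keypoints(pts, scores=[0.9, 0.8, 0.4, 0.7], radius=3)
```

Comparing `grid_count_variance` before and after suppression gives a simple number for how much more evenly the surviving points cover the image.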

Abstract:It is difficult for a single robot to perform tasks in a complex environment, so cooperative systems of unmanned aerial and ground vehicles (UAV/UGV) have attracted wide attention. To improve the efficiency of UAV/UGV cooperative systems, a global path planning method for the UGV, based on targets recognized by the UAV, is proposed. Firstly, the SURF algorithm is used to identify targets, and image segmentation is applied to build a map. Then an optimized A* algorithm is proposed for global path planning of the UGV based on the information acquired by the UAV. Finally, simulations are performed in a typical rescue scenario. Experiments show that the SURF algorithm achieves accurate, real-time and robust target recognition, and that the optimized A* algorithm achieves feasible, real-time global path planning.
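
For readers unfamiliar with A*, the following is a minimal sketch of the standard algorithm on an occupancy grid such as one built from a segmented aerial image. It is the textbook version with a Manhattan-distance heuristic and 4-connected moves, not the optimized variant proposed in the paper, whose details the abstract does not give; the grid and function name are illustrative assumptions.

```python
import heapq

def astar(grid, start, goal):
    """Shortest 4-connected path on a grid of 0 (free) and 1 (obstacle) cells.

    Returns the list of cells from start to goal, or None if no path exists.
    """
    rows, cols = len(grid), len(grid[0])

    def h(cell):
        # Manhattan distance: admissible for unit-cost 4-connected moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(h(start), 0, start)]        # entries are (f = g + h, g, cell)
    came_from = {}
    best_cost = {start: 0}
    while open_heap:
        _, cost, cell = heapq.heappop(open_heap)
        if cell == goal:
            path = [cell]
            while cell in came_from:          # walk back to the start
                cell = came_from[cell]
                path.append(cell)
            return path[::-1]
        if cost > best_cost[cell]:
            continue                          # stale heap entry
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                new_cost = cost + 1
                if new_cost < best_cost.get(nxt, float("inf")):
                    best_cost[nxt] = new_cost
                    came_from[nxt] = cell
                    heapq.heappush(open_heap, (new_cost + h(nxt), new_cost, nxt))
    return None

grid = [
    [0, 0, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 0],
]
path = astar(grid, (0, 0), (2, 0))
```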

Abstract:The goal of image coloring is to assign a color to each pixel of a grayscale image, which is a hot topic in image processing. A fully automatic coloring network model is designed based on deep learning and convolutional neural networks, with U-Net as the backbone network. In this model, a branch uses the convolutional neural network SE-Inception-ResNet-v2 as a high-level feature extractor to extract the global information of the image, and the Power Linear Unit (PoLU) function replaces the Rectified Linear Unit (ReLU) function in the network. The experimental results show that this coloring network model can effectively color grayscale images.
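
The abstract does not state the PoLU formula. The sketch below assumes the definition usually given for the Power Linear Unit, namely the identity for non-negative inputs and (1 - x)^(-n) - 1 for negative inputs; the exponent `n = 2` is an arbitrary illustrative choice, not a value from the paper.

```python
import numpy as np

def polu(x, n=2.0):
    """Power Linear Unit under the assumed definition:
    f(x) = x for x >= 0, and (1 - x)^(-n) - 1 for x < 0.
    """
    x = np.asarray(x, dtype=float)
    neg = np.minimum(x, 0.0)                  # clamp so the power is only evaluated on x <= 0
    return np.where(x >= 0, x, (1.0 - neg) ** (-n) - 1.0)

def relu(x):
    """Rectified Linear Unit, for comparison."""
    return np.maximum(np.asarray(x, dtype=float), 0.0)

values = polu([-3.0, -1.0, 0.0, 2.0])
```

Unlike ReLU, which outputs zero for every negative input, this function keeps a small negative response that is bounded below by -1.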

Abstract:This paper proposes a finger vein recognition algorithm based on the Capsule Network (CapsNets) to solve the problem of finger vein information loss in the Convolutional Neural Network (CNN). Throughout learning, CapsNets passes information from the lower layers to the higher layers in the form of capsules, so that the multidimensional characteristics of the finger vein are encapsulated as vectors and the features are preserved in the network rather than recovered after being lost. In this paper, 60 000 images are used as the training set and 10 000 images as the test set. The experimental results show that the network structure features of CapsNets are more distinct than those of CNN, that accuracy is 13.6% higher than that of VGG, and that the loss converges to 0.01.

Abstract:This paper designs a hardware system framework for real-time recognition based on deep learning. The framework uses Keras to train the convolutional neural network model and extracts the network parameters. Using the FPGA+ARM hardware/software co-design of the ZYNQ device, the ARM core handles the acquisition, preprocessing and display of real-time image data. The convolutional neural network is implemented in hardware on the FPGA, which recognizes the image and sends the recognition result to the host computer for real-time display. The MNIST and Fashion MNIST data sets are used as test samples for the hardware implementation of the network model. The experimental results show that the framework can display and identify image data accurately and in real time in general scenes, and that it offers high portability, fast processing speed and low power consumption.

Abstract:The recognition of handwritten digits is an important part of artificial intelligence recognition systems. Because handwriting differs between individuals, existing recognition systems have low accuracy. This paper uses the TensorFlow deep learning framework to implement handwritten digit recognition and apply it. Firstly, Softmax and Convolutional Neural Network (CNN) model structures are established and analyzed. Secondly, the models are trained on the 60 000 training samples of the MNIST handwritten digit data set, and then tested and compared on 10 000 samples. Finally, the best model is ported to the Android platform. Compared with the traditional Softmax model, the TensorFlow-based CNN model reaches a recognition rate of 99.17%, an increase of 7.6%, which is of value for research on artificial intelligence recognition systems.

Abstract:To address the low accuracy of human behavior recognition, a neural network combining a batch-normalized convolutional neural network (CNN) with a long short-term memory (LSTM) network is proposed. The CNN part introduces batch normalization, so the training data fed to the network are normalized per mini-batch. After the fully connected layer, the features are passed to the LSTM network. The algorithm adopts a spatial-temporal two-stream network structure: the RGB images of the video serve as input to the spatial stream, and the optical flow field images serve as input to the temporal stream. The recognition results of the two streams are then combined in a certain proportion to obtain the final behavior recognition result. The experimental results show that the two-stream neural network algorithm designed in this paper achieves high recognition accuracy in human behavior recognition tasks.
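
The final fusion step, combining the two streams "in a certain proportion", can be sketched as a weighted average of per-class scores. The function name, the weight of 0.6 on the temporal stream and the example scores are assumptions for illustration; the abstract does not state the proportion used.

```python
import numpy as np

def fuse_two_stream(spatial_scores, temporal_scores, temporal_weight=0.6):
    """Weighted average of per-class scores from the spatial and temporal streams.

    `temporal_weight` is an illustrative assumption, not the paper's value.
    Returns the fused scores and the index of the predicted class.
    """
    s = np.asarray(spatial_scores, dtype=float)
    t = np.asarray(temporal_scores, dtype=float)
    fused = (1.0 - temporal_weight) * s + temporal_weight * t
    return fused, int(np.argmax(fused))

# Example: the spatial stream favours class 0, the temporal stream favours class 1.
fused, label = fuse_two_stream([0.5, 0.3, 0.2], [0.2, 0.7, 0.1])
```

In practice the proportion would be tuned on a validation set rather than fixed in advance.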

Abstract: This paper analyzes the hot topics in American mainstream news media coverage of the “Belt and Road” initiative and studies the sentiment of the related public opinion. A web crawler automatically collects relevant news and filters high-frequency words to identify the hotspots of media attention. An integrated automatic summarization and convolutional neural network (CNN) model is proposed for document-level sentiment analysis. The model first extracts a summary to remove the interference of unimportant data in the original document, then uses the CNN to analyze sentence-level sentiment and obtains a document-level sentiment score based on the semantic orientation method; articles with abnormal sentiment fluctuation are analyzed a second time. Comparative experiments on real data show that the integrated summarization-CNN model is superior to a CNN alone for document-level sentiment analysis.

Abstract: Because the convolutional neural network (CNN) in deep learning is slow and time-consuming on a CPU platform, this paper designs and implements a deep convolutional neural network system on an FPGA hardware platform. The system uses the rectified linear unit (ReLU) as the activation function for feature outputs and the Softmax function as the output classifier. Pipelining and parallelism are applied to the feature computation of each layer, so that 295 convolution operations in the entire CNN can be completed in one system clock cycle. The system is evaluated on the MNIST data set. The experimental results show that training on the FPGA at 50 MHz is 8.7 times faster than on a general-purpose CPU, and that the recognition accuracy after 2 000 iterations is 92.42%.

Abstract: Because the location of acupoints directly affects the therapeutic effect, we designed a relative-coordinate prediction model based on particle swarm optimization and a BP neural network (PSO-BP), and combined it with an ARM processor to form a system for locating human acupuncture points. Firstly, the model is trained in MATLAB simulation on a PC. The optimal weights and thresholds are then saved and the algorithm is embedded in the ARM, which turns online prediction into an offline process. The experimental results show that optimizing the BP neural network with particle swarm optimization effectively mitigates its tendency to fall into local extrema. The system can locate acupoints at the positioning end and display the point information on an LCD. After the control terminal receives the location data, it can drive the motor to move.

Abstract: Based on TensorFlow, the second-generation artificial intelligence learning system, this paper constructs a neural network to detect smoke in images and uses an improved motion detection algorithm to crop the image regions suspected of containing smoke. Combined with the PCA dimensionality reduction algorithm, the Inception-ResNet-v2 network model is trained on the TensorFlow platform to recognize smoke characteristics. The algorithm provides wide-area, real-time fire detection and alarm. Experiments show that the detection process accurately identifies the smoke region in the video stream, is more accurate and adaptive than traditional smoke recognition methods, and provides an effective scheme for wide-area fire smoke alarms.

Abstract: This paper proposes a design for chest X-ray image analysis that uses embedded technology and deep learning. The hardware platform of the analysis system uses NVIDIA's Jetson TX2 as the core board and is equipped with Ethernet, WiFi and other functional modules. A MobileNets convolutional neural network is trained on a GPU server with a labeled chest X-ray image data set, and the trained model is then ported to the Jetson TX2 core board to detect pleural effusion, infiltration, emphysema, pneumothorax and atelectasis on the embedded platform. The trained model was tested on chest X-ray image data provided by the National Institutes of Health (NIH). Experiments show that this method achieves higher accuracy and requires less time than other methods.

Abstract: This paper proposes a convolutional neural network (CNN) for image classification that uses overlapping pooling and dropout to address overfitting. Compared with a traditional CNN, the proposed network obtains better results on the CIFAR-10 data set, where the accuracy on the test set is about 9 percent higher than that on the training set.
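
Overlapping pooling means that the pooling window is larger than its stride, so neighbouring windows share pixels. A minimal sketch with plain NumPy follows; the 3x3 window and stride of 2 are common illustrative choices, not values taken from the paper.

```python
import numpy as np

def max_pool2d(x, size=3, stride=2):
    """2-D max pooling on a single-channel array; windows overlap whenever size > stride."""
    x = np.asarray(x, dtype=float)
    out_h = (x.shape[0] - size) // stride + 1
    out_w = (x.shape[1] - size) // stride + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            window = x[i * stride:i * stride + size, j * stride:j * stride + size]
            out[i, j] = window.max()
    return out

feature_map = np.arange(25).reshape(5, 5)
pooled = max_pool2d(feature_map, size=3, stride=2)   # adjacent 3x3 windows share one row or column
```

Setting `size` equal to `stride` gives ordinary non-overlapping pooling for comparison.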

Abstract: This work proposes and designs a smart seeing glasses system based on machine vision. The hardware platform uses the Samsung S5PV210, a Cortex-A8 processor, as the central processor running Linux, and is equipped with six core modules: binocular image acquisition, GPS, voice broadcast, GSM SMS, voice calls and wireless transmission. Target scenes are identified on a remote cloud server by a deep learning algorithm, and accurate real-time voice guidance is then given to the blind user while walking. System tests show that the smart glasses not only give correct travel guidance to the blind but can also identify simple objects, helping the user classify simple items. In addition, the system provides GPS positioning, voice calls, GSM SMS and other auxiliary functions.

Abstract: The widespread use of Unmanned Aerial Vehicles (UAVs) brings convenience to people but also has negative effects. For instance, UAVs may fly into no-fly zones, causing safety problems, or violate privacy through inappropriate use. A UAV policing system is therefore needed to supervise UAVs and curb uncontrolled flying. Traditional identification methods lack flexibility and precision. This paper studies a UAV recognition algorithm based on deep learning, which obtains an efficient recognition model and classifies UAVs and non-UAVs by training a learning network modified from Convolutional Neural Networks (CNNs). Model tests show that this method has better extensibility and a higher recognition rate.

Abstract: On existing production lines, the grasping points of industrial robots are fixed, and workpieces can only be placed in a fixed posture and position. This assembly mode is inefficient and struggles to meet complex industrial production requirements. A vision-guided SCARA automatic assembly system is designed to improve the original system. The machine vision system rapidly identifies and locates the workpieces and determines their attitude, and the assembly system grasps and places them precisely. The image processing algorithm is implemented in an MFC program in Visual Studio, and the coordinate and attitude data are sent to the SCARA robot. Experimental results demonstrate the stability and speed of the system, which meets the production requirements and significantly improves productivity.