A Method for Extracting Data Elements from Chinese Electronic Medical Records

- DOI:10.3969/j.issn.1673-6036.2024.08.013
- VernacularTitle:中文电子病历数据元抽取方法
- Author:Weijia GUO 1; Shaoyou GUO
- Author Information:1. Henan Provincial Library (河南省图书馆), Zhengzhou 450052
- Keywords:electronic medical records (EMR); data element; ALBERT; sequence labeling; token
- From:Journal of Medical Informatics 2024;45(8):78-83
- Country:China
- Language:Chinese
- Abstract:Purpose/Significance: A method is proposed for extracting data elements from electronic medical records (EMR) based on national standards, helping to achieve fine-grained sharing of EMR data. Method/Process: The ALBERT, BiLSTM and CRF models are used to perform sequence labeling on EMR, and a set of candidate data elements is generated from the labeling results. For each candidate data element, contextual information is collected to form an enhanced key vector. The similarity between this vector and the standard vector is then calculated to determine whether the candidate data element is valid. Result/Conclusion: The F1 value is 90.32%, indicating that the proposed method performs well.
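
The two post-labeling steps named in the abstract (building candidates from the sequence labels, then checking each candidate against a standard vector) might look roughly like the sketch below. The abstract does not give the details, so everything here is an assumption made for illustration: the BIO tag scheme, the label names `ITEM` and `VALUE`, averaging as the way to form the enhanced vector, cosine similarity as the similarity measure, and the 0.8 threshold. The tagger itself (ALBERT, BiLSTM, CRF) is not reproduced; its output is taken as given.

```python
import math


def bio_to_spans(tokens, tags):
    """Group tokens into (label, text) candidates from BIO tags (assumed scheme)."""
    spans = []
    label, chars = None, []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new span starts; close any span that is still open.
            if label is not None:
                spans.append((label, "".join(chars)))
            label, chars = tag[2:], [token]
        elif tag.startswith("I-") and label == tag[2:]:
            # Continuation of the currently open span.
            chars.append(token)
        else:
            # "O" or an inconsistent I- tag: close the open span, if any.
            if label is not None:
                spans.append((label, "".join(chars)))
            label, chars = None, []
    if label is not None:
        spans.append((label, "".join(chars)))
    return spans


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def enhanced_vector(candidate_vec, context_vecs):
    """Average the candidate vector with its context vectors (assumed combination)."""
    vectors = [candidate_vec] + list(context_vecs)
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def is_valid(candidate_vec, context_vecs, standard_vec, threshold=0.8):
    """Accept a candidate if its enhanced vector is close enough to the standard vector."""
    enhanced = enhanced_vector(candidate_vec, context_vecs)
    return cosine_similarity(enhanced, standard_vec) >= threshold


if __name__ == "__main__":
    tokens = ["血", "压", "1", "2", "0"]
    tags = ["B-ITEM", "I-ITEM", "B-VALUE", "I-VALUE", "I-VALUE"]
    print(bio_to_spans(tokens, tags))
    print(is_valid([1.0, 0.0], [[0.9, 0.1]], [1.0, 0.0]))
```

In practice the vectors would come from an embedding model and the threshold would be tuned on labeled data; the toy two-dimensional vectors above only exercise the control flow.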