Do while loop
Originally posted by: karennap
Hi all,
I have an Excel file listing "stop words" (words I need to remove from my text); there are over 1,000 words on the list.
I also have 22,000 text files, which I'm able to pull into an Input raw node. This gives me a table in which one column, called narrative, contains all the text from the text file.
I now need to remove every stop word from this narrative field for each of my 22,000 rows.
I know a Do While Node can help me, but I'm not sure what the inputs and outputs should be. I assume the right logic here is to loop through my stop word list and do a Narrative.Replace('StopWord',''), but I'm having trouble.
Can anyone talk this through for me?
Many thanks to anyone who can help - pulling my hair out.
-
Originally posted by: stonysmith
The code below is one example of how to accomplish what you want.
I used the DoWhile function within BrainScript rather than using the DoWhile node.
Warning: this could be a bit slow. We might have to look at some other solution if it's too slow for your needs.
node:Edit_out_StopWords
bretype:core::Lookup
editor:Label=Edit out StopWords
editor:sortkey=59c3f1844d353263
input:@40fd2c746abc6dc7/=Narrative.40fe6c55598828e5
input:@40fd2c74486e4494/=Build_StopWord_List.40fd2c744c862db0
output:@40fd2c7445835585/=
prop:InputKey=<<EOX
1
EOX
prop:LookupKey=<<EOX
1
EOX
prop:Script=<<EOX
sw=StopWords.split(",")
i=0
n=Narrative
while i<len(sw) {
  n=replace(n,sw[i],"")
  i=i+1
}
emit n as Narrative
EOX
editor:XY=390,130
end:Edit_out_StopWords

node:Build_StopWord_List
bretype:core::Agg
editor:Label=Build StopWord List
editor:sortkey=59c3f0f520a54b4c
input:@40fd2c7427456e5b/=StopWords.40fe6c55598828e5
output:@40fd2c744c862db0/=
prop:GroupBy=<<EOX
1
EOX
prop:Script=<<EOX
s=groupString(StopWord,",")
emit s as StopWords
where lastInGroup
EOX
editor:XY=330,230
end:Build_StopWord_List

node:Narrative
bretype:core::Static Data
editor:Label=Narrative
editor:sortkey=59c3f1f81cda6d9f
output:@40fe6c55598828e5/=
prop:StaticData=<<EOX
Narrative
Red Green Blue Yellow Cyan Magenta Orange Chartreuse Aquamarine Azure Violet Fuchsia Grue Bleen Octarine Garrow Gendale Hooloovoo Fire Ice
EOX
editor:XY=250,130
end:Narrative

node:StopWords
bretype:core::Static Data
editor:Label=StopWords
editor:sortkey=59c3edfc083e6352
output:@40fe6c55598828e5/=
prop:StaticData=<<EOX
StopWord
Red
Blue
Green
EOX
editor:XY=250,230
end:StopWords
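For anyone who wants to see the logic outside the graph, here is a minimal Python sketch of what the script above does, plus one possible alternative if the loop turns out to be too slow. This is an illustration only, not part of the posted graph: the function names are hypothetical, and it assumes the narrative and stop words are plain strings. The first function mirrors the loop (one substring replace per stop word, so "Red" is also stripped out of "Redwood"). The second approach compiles all stop words into a single whole-word pattern, so each narrative is scanned once rather than once per stop word; whether that is actually faster for 22,000 rows and 1,000+ words would need to be measured.

```python
import re


def remove_stop_words_loop(narrative, stop_words):
    """Mirror of the script's loop: one plain substring replace per stop word."""
    result = narrative
    for word in stop_words:
        result = result.replace(word, "")
    return result


def build_stop_word_pattern(stop_words):
    """Compile one case-sensitive, whole-word pattern covering every stop word.

    Assumes each stop word starts and ends with a letter, digit, or
    underscore; otherwise the \\b word boundaries will not behave as intended.
    """
    # Longest first, so a longer entry is tried before a shorter one
    # that is a prefix of it (relevant for multi-word entries).
    ordered = sorted(set(stop_words), key=len, reverse=True)
    alternation = "|".join(re.escape(word) for word in ordered)
    return re.compile(r"\b(?:" + alternation + r")\b")


def remove_stop_words_regex(narrative, pattern):
    """Remove whole-word matches in one pass, then collapse leftover spaces."""
    stripped = pattern.sub("", narrative)
    return " ".join(stripped.split())


if __name__ == "__main__":
    stop_words = ["Red", "Blue", "Green"]
    text = "Red Green Blue Yellow Redwood"

    # Substring replace also cuts "Red" out of "Redwood".
    print(remove_stop_words_loop(text, stop_words).split())  # ['Yellow', 'wood']

    # Whole-word matching leaves "Redwood" intact.
    pattern = build_stop_word_pattern(stop_words)
    print(remove_stop_words_regex(text, pattern))  # Yellow Redwood
```

Build the pattern once and reuse it for every row; recompiling it per narrative would throw away most of the benefit.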