AuthorWrite something about yourself. No need to be fancy, just an overview. ArchivesCategories |
Back to Blog
Power query clean text4/17/2023 ![]() ![]() I couldn't work on the data in your test because you've linked to a local file. I've attached an excel file with the power query solution to your problem. Thanks again for all your help and sorry if these are basic questions - im still new to PBI Would be great if I could get this working in the PBIX file, in the Transform section I also attach the PBIX and source Excel file if that helps? Thanks ever so much for this - how do I view the query in the Excel file? Sorry if this sounds like a daft question If that's still giving you a headache let me know and I can probably record a short video of me doing it on Stream. The best advice I can give you is to open my excel file alongside your PBI file and look at what I've done and try to do it alongside. However, when YOU click Transform in Power BI it'll take you straight to the Power Query editor. I'm afraid I can't repeat the steps in your Test PBI file because the Transform command tries (unsuccessfully obviously) to take me to your C drive where the data source is kept. Power Query is essentially the back end to Power BI. It works in exactly the same way in Power BI. Obviously if you were to introduce a record with different delimiters you'd have to add a step to deal with them. If you added more raw records to it and then when back to the query and refreshed it, Power Query would apply the same steps to the data and clean it for you by following the Applied Steps. Sheet 1 of the spreadsheet contains your original table. See attached file which has three columns of data:ġ) Square Brackets: I would like to remove the text between the square brackets including the square bracketsĢ) Colon: I would like to remove all the text after the colon, I think I need to do a LEFT I have some data and I need to clean it - but im not sure what the best way is. Please give me shout if you're still confused. ![]() Assuming any new data you load suffers from the same problems this should clean it up as you add it to the table. I've relabelled all the steps in the query to show what I did. I couldn't work on the data in your test because you've linked to a local file.īasically you need to do a bunch of split columns by delimiter ( [, :, and respectively) and them merge the HTML columns that contain the relevant text back together again and then perform a final trim on them to get rid of the leading space. This will open up the power query editor.Subject: Cleaning data - Remove Text with Square Brackets, Colon and HTML ![]() Go to the Data tab ➜ then go to Get Data ➜ Choose From Other Sources ➜ then choose Blank Query. This is where we will place the M code for our function. The first step to creating a query function is to create a new blank query. Then it applies the Text.Trim function to take care of any leading or trailing space characters.Ĭreating a recursive function solves the problem of manually applying repeated Replace Values steps to remove double spaces and is a much more elegant and re-usable solution. This function takes a string of text and replaces double spaces with single spaces and repeats the process recursively until it doesn’t find any more double spaces. M Code For The Query Function let ExcelTrim = (TextToTrim) => let ReplacedText = Text.Replace(TextToTrim, " ", " "), Result = if not(Text.Contains(ReplacedText, " ")) then ReplacedText else in Text.Trim(Result) in ExcelTrim So we’re going to create a power query function to replicate Excel’s superior TRIM function. But more importantly, if new data comes in then the number of Replace Values steps might not be sufficient anymore. let fnCngNotStndCharsToDef (strin as text, stndChrs as list, optional defChar as text) > Text.Combine ( List.Transform ( Splitter.SplitTextByRepeatedLengths (1) (strin), each Record.FieldOrDefault (Record. It means repeating a step which takes a bit more effort. Then repeat the process until there are no more double spaces. My solution in the post was to apply the Trim command then use the Replace Values command to replace instances of double spaces with single spaces. There is also a Text.Trim function in power query, but it only takes off spaces from the start and end of a text string. So when there’s more than one space between a word, it will replace them with a single space. It will remove all space characters from the start and end of a text string, as well as remove excess spaces between words. ![]() Two solutions I looked at were the TRIM function in Excel and using Power Query.Įxcel’s TRIM function is easy to use and does a great job. The ProblemĪ while ago, I wrote a post on 4 ways to remove unwanted space characters from text data. If you find you’re always doing the same transformation or calculation, why not turn it into a function? This can simplify the steps in your query and allow you to re-use a solution. In this post we’re going to learn how to build a custom query function in power query that will replicate Excel’s TRIM function! ![]()
0 Comments
Read More
Leave a Reply. |