String Manipulation using Excel Functions

String Manipulation using Microsoft Excel Text Functions
String Manipulation using Microsoft Excel Text Functions

     Last month I had an interesting Twitter exchange with Microsoft Excel’s Social Media team. @MSExcel asked it’s followers to share an Excel function or formula that they felt most proud of mastering. Many responses included rather complex formulas involving a clever combination of different functions. Yet, my example simply listed Excel’s MID function. When asked why I used such a basic example I reiterated a personal belief that Excel’s text manipulation functions are greatly underused and consequently underrated. Surely, VLOOKUP function is probably the most in-demand Excel function, soon to be replaced with a much anticipated upgrade: XLOOKUP ; yet many Excel users simply overlook or don’t realize the power and usefulness of Excel’s text manipulation functions. Whether you need to combine multiple strings together, or reformat your case-sensitive text, or perhaps parse out text without using Text to Columns tool , or even remove unwanted text, or find precise position of a specific letter or word, or replace old values with the new ones, or calculate number of characters in your string; all of these tasks can easily be accomplished with the help of mighty Excel text functions. An icing on the cake is the fact that these functions are very easy to use as long as you’re aware of their existence. In this post we will cover following text functions: UPPER, LOWER, PROPER, LEN, LEFT, RIGHT, CONCATENATE and already mentioned MID Excel functions.

1. String Concatenation.

     You have a couple of equally easy options when it comes to stitching multiple strings together. A trusted CONCATENATE function takes in up to 30 text strings and individual cell references separated by a coma and outputs combined text. Note that you would need to allow for punctuation: spaces ” ” or special characters, such as a coma “,” or combinations of both; or even carriage returns CHAR(10) in your formulas. If, in fact, you are using a carriage return character, ensure that your cell is formatted to align text via the Wrap text option. Should you struggle with spelling out CONCATENATE term, you can instead use the ampersand [&] sign to separate individual strings or cells.

     Final result of the below formulas is exactly the same, so it becomes a personal preference which solution works better for you:
=CONCATENATE(“Excel”,” “,”Strategies”,”,”,” “,”LLC”)

OR

=”Excel” & ” ” & “Strategies” & “,” & ” “& “LLC”

     If you are fortunate enough to be using Excel 2019 or an Excel 365 subscriber, you can use the most flexible concatenation function which even supports ranges of cells in addition to individual cell references: CONCAT .

2. Text formatting.

     Do you ever search for a specific phrase only to realize that your query doesn’t yield any results due to case sensitivity and inconsistent data entry practices? Not only can you change case text to either UPPER , LOWER , or even a PROPER case; doing so would also make your worksheet look more professional with consistent data display rules. The syntax simply asks for the text or a cell reference to be converted to the specified case:
Text formatting functions in MS Excel

3. Calculating number of characters in a string.

     While I truly hope no-one will ever find themselves in a hellish predicament having to spell Czechoslovakia , you can count [pun intended] on Microsoft Excel to help you count number of letters in this country that no longer exists. LEN function will come to the rescue and do the monumental task of counting all 14 letters for us.

4. Extracting text strings, words or letters.

     Sometimes you simply need to retrieve a word from a larger phrase or a number of characters from the bigger word. Using RIGHT function you could specify number of characters to retrieve from the right side/end of the string – in other words: N number of last characters. As you might’ve guessed by now, LEFT function would parse out N number of first characters from the left side/beginning of a string. If you’re like me and don’t want to be constrained to either beginning or the end of a string, you could leverage MID function, which returns N number of characters starting from the position that you would specify:

= MID (text, start_num, num_chars) , where all three arguments are required :

Text – text string/cell reference containing characters to be parsed out.

Start_num – position of the first character in your retrieval string.

Num_chars – number of characters to be extracted.

     As you can see the feat of “mastering” MID function is hardly worthy of any mention, yet, the applications for this function are more than bountiful:

Using Excel's MID function

5. Putting it all together.

Exhibit A – Concatenation and formatting.

     Suppose you had a list of your employees showing their gender, first, middle, and last names; followed by their address information (address, city, state, and zip) all grouped in separate columns. For unified reporting purposes you want to combine address information into one column, show gender in proper case instead of lower case in the original data source, and combine first and middle names while fixing data entry inconsistencies of mixed lower and upper/proper case in the original data. The illustration below utilizes CONCATENATE, PROPER, and LEFT functions, and should be fairly intuitive:

Text Concatenation and Formatting in Excel

Exhibit B – Parsing text strings.

     What do you call text manipulation that’s opposite [in reverse of] of concatenation – text parsing of course! Invoking the RIGHT function to retrieve last 8 letters from Czeckoslovakia would yield slovakia, so would calling the MID function starting at the 7th position and retrieving the same 8 characters. If we wanted to write a more dynamic formula that avoids hard-coding either starting position, or the number of letters in a phrase, we could start by introducing Excel’s FIND function:

FIND(find_text, within_text, [start_num]) , with the below arguments :

Find_text – text to be found.

Within_text – text containing text to be found.

Start_num – OPTIONAL – specifies the first character for starting the search. Assumed to be 1 if omitted.

     Keep in mind that this function is case-sensitive, so consider using PROPER or UPPER or LOWER functions for best results.

     Armed with the knowledge of text functions we can now author our flexible and rather involved formula. First step is finding the position of letter “s” which, in fact is found in the 7th position – no longer do we need to hard-code the starting position argument of our MID function. When it comes to the length of the string to be retrieved we can subtract position of the letter s from the total length (number of letters) in the word. Finally, we do need to account for the letter s itself by adding it’s length [one character] to our formula. While this is rather simple logic, resulting formula does look rather formidable. Compare and contrast two versions below:

=MID(“Czechoslovakia”,7,8)

vs.

=MID(“Czechoslovakia”,FIND(“s”,”Czechoslovakia”),LEN(“Czechoslovakia”)-FIND(“s”,”Czechoslovakia”)+1)

Perhaps an illustration is worth a thousand formulas here:

Text Manipulation in Excel

Exhibit C – Parsing text – fun continues.

     If you needed to retrieve first and last names from ONE column housing names of students starting with their last name, followed by a comma and a space and ending with a first name you could consider using the Text to Columns tool, but writing formulas would likely be more rewarding and flexible solution. To extract the last name we would first find position of the comma , subtract it from the total text length and adjust for the comma character itself and retrieve the resulting number of letters from the beginning of the text. To parse out the first name sub-string, we would also find position of the comma , similarly subtract it from the total length of the the text, yet then retrieve resulting number of characters from the beginning of our master string. Resulting formulas might appear intimidating at a first glance, but applying our building blocks methodology of creating smaller, targeted formulas first and combining them next this process should become easier to replicate:

Parsing Text in Excel

      BONUS TIP: if you need to display actual formula as a text, refer to the FORMULATEXT function.

     As always, I’ve included Excel file with all formulas and functions discussed in this post for your further practice.

Have you you ever performed any string manipulation actions using Excel Text functions?

 

Leave a Reply

Your email address will not be published. Required fields are marked *