Love how Snowflake customers are using the new #Cortex #AI #LLM Functions in creative ways to solve even basic #DataEngineering problems that would normally require custom UDFs written in #Python, #Java or #Scala.

Here is an example of using #cortex_complete via #mistral_7b to convert standard US addresses into an array of individual parts (street number, city, zip, etc.), then using the #PARSE_JSON() + #GET(Array, N) functions to split the array elements into separate columns.

SELECT
    RawAddress,
    PARSE_JSON(
        SNOWFLAKE.CORTEX.COMPLETE(
            'mistral-7b',
            'Parse the given address into following array of values without any comments: [address number, street name, unit number, city, state, zip]'
            || ' content: ' || RawAddress))::array as AddressArray,
    GET(AddressArray, 0)::string as BuildingNo,
    GET(AddressArray, 1)::string as Street,
    GET(AddressArray, 2)::string as UnitNo,
    GET(AddressArray, 3)::string as City,
    GET(AddressArray, 4)::string as State,
    GET(AddressArray, 5)::string as Zip
FROM AddressList;

If there is ever a concept of TOO SIMPLE to use, I think #Snowflake is borderline there.
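For readers who want to see what the PARSE_JSON + GET step amounts to outside of Snowflake, here is a minimal Python sketch. It is not part of the original query: the function name `split_address_array` and the sample model response are hypothetical, and the sketch assumes the model returns a plain JSON array of six values. Unlike PARSE_JSON, which raises an error on invalid input, it falls back to empty columns when the response is not a JSON array.

```python
import json
from typing import Dict, Optional

# Column names mirror the aliases used in the query above.
FIELDS = ["BuildingNo", "Street", "UnitNo", "City", "State", "Zip"]


def split_address_array(llm_output: str) -> Dict[str, Optional[str]]:
    """Split a JSON array string into named columns by position."""
    try:
        parsed = json.loads(llm_output)
    except (json.JSONDecodeError, TypeError):
        parsed = None

    if not isinstance(parsed, list):
        # The model added commentary or returned something other than an array.
        return {name: None for name in FIELDS}

    row: Dict[str, Optional[str]] = {}
    for index, name in enumerate(FIELDS):
        # A missing position becomes None, similar to GET returning NULL.
        value = parsed[index] if index < len(parsed) else None
        row[name] = None if value is None else str(value)
    return row


# Hypothetical model response, made up for illustration.
sample = '["123", "Main St", "Apt 4B", "Springfield", "IL", "62704"]'
print(split_address_array(sample))
```

Returning empty columns instead of failing makes it easy to count how many rows the model did not format as requested, which matters once the input is messier than clean US addresses.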
Any chance the quotas would increase? As of now it looks like you can only process 500 rows a minute. https://docs.snowflake.com/en/user-guide/snowflake-cortex/llm-functions
I recognize that sql!
Would be interesting to see how this works for noisier data (e.g. human-typed) or addresses outside the US.
I wonder if we can use something similar to dynamically parse JSON key-value pairs into columns and rows.
It is something incredible, I will definitely try it
Very impressive, will definitely try this out
Love your post! It really shows how simple it is to use LLMs with Snowflake in a real-world scenario!!!
Great amount of potential here!! Thanks for posting!
Love it
I think I would prefer having the LLM write SQL that does it instead. It would be verifiable and a lot cheaper to run. I could see the LLM being useful for more creative uses that don't need to be exact, though.
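As a rough illustration of the deterministic alternative, here is a small rule-based parser. It is written in Python rather than SQL to keep it self-contained, and it is only a sketch: the pattern `ADDRESS_RE` and the function `parse_us_address` are hypothetical names, and the regex assumes one specific layout ("number street [unit], city, ST zip"). Anything outside that layout returns None, which is the trade-off against the LLM approach.

```python
import re
from typing import Dict, Optional

# Assumed layout: "<number> <street> [<unit>], <city>, <ST> <zip>"
ADDRESS_RE = re.compile(
    r"""^\s*
    (?P<BuildingNo>\d+)\s+                                   # street number
    (?P<Street>[^,]+?)                                       # street name
    (?:,?\s+(?P<UnitNo>(?:Apt|Unit|Suite|Ste|\#)\s*\w+))?    # optional unit
    \s*,\s*
    (?P<City>[^,]+?)\s*,\s*                                  # city
    (?P<State>[A-Z]{2})\s+                                   # two-letter state
    (?P<Zip>\d{5}(?:-\d{4})?)                                # ZIP or ZIP+4
    \s*$""",
    re.VERBOSE,
)


def parse_us_address(raw: str) -> Optional[Dict[str, Optional[str]]]:
    """Return the address parts, or None if the input does not fit the layout."""
    match = ADDRESS_RE.match(raw)
    return match.groupdict() if match else None


print(parse_us_address("123 Main St Apt 4B, Springfield, IL 62704"))
print(parse_us_address("somewhere in Texas"))
```

The same rules always give the same result for the same input, so the output can be checked once and trusted afterwards, and no model call is made per row.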