ASP.NET

 
December 16th, 2016, 05:43 AM
Sneha123
Text search and extraction in pdf file

I am working on text search and extraction from PDF files using the third-party iTextSharp DLL.
The search finds the text, but what I get back is the whole text of that page, not just the matching text.
I thought of using phrases or chunks so that I get only the matched text together with the text just before and after it, instead of the whole page. Can anyone suggest code for phrases, or anything else I could use for this? Thanks!
My code is:
Code:
// Requires: using System.Collections.Generic; using System.IO;
//           using iTextSharp.text.pdf; using iTextSharp.text.pdf.parser;
string filename = @"C:\test.pdf";
string searchText = textBox.Text;

List<int> pages = new List<int>();
if (File.Exists(filename))
{
    PdfReader pdfReader = new PdfReader(filename);

    for (int page = 1; page <= pdfReader.NumberOfPages; page++)
    {
        // Create a new strategy for each page so text from earlier pages is not carried over.
        ITextExtractionStrategy strategy = new SimpleTextExtractionStrategy();
        string currentPageText = PdfTextExtractor.GetTextFromPage(pdfReader, page, strategy);

        if (currentPageText.Contains(searchText))
        {
            pages.Add(page);
            // This appends the text of the whole page, not only the match.
            textBox1.AppendText(currentPageText);
        }
    }
    pdfReader.Close();

    // List the matching page numbers; pages.ToString() would print only the type name.
    textBox1.AppendText(string.Join(", ", pages));
}
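
One possible way to get only the text around each match, instead of the whole page, is to cut a window out of the page string that the loop above already extracts. The sketch below is a plain-string approach and does not use iTextSharp phrases or chunks. SnippetFinder, ExtractSnippets, and the contextChars parameter are hypothetical names introduced here for illustration, and the matching is case-sensitive to mirror the Contains call in the posted code.

```csharp
using System;
using System.Collections.Generic;

public static class SnippetFinder
{
    // Hypothetical helper: returns every occurrence of searchText in pageText,
    // together with up to contextChars characters before and after it.
    // Assumes contextChars is zero or positive.
    public static List<string> ExtractSnippets(string pageText, string searchText, int contextChars)
    {
        var snippets = new List<string>();
        if (string.IsNullOrEmpty(pageText) || string.IsNullOrEmpty(searchText))
        {
            return snippets;
        }

        int index = pageText.IndexOf(searchText, StringComparison.Ordinal);
        while (index >= 0)
        {
            // Clamp the window to the bounds of the page text.
            int start = Math.Max(0, index - contextChars);
            int end = Math.Min(pageText.Length, index + searchText.Length + contextChars);
            snippets.Add(pageText.Substring(start, end - start));

            // Continue searching after the current match.
            index = pageText.IndexOf(searchText, index + searchText.Length, StringComparison.Ordinal);
        }
        return snippets;
    }
}
```

Inside the page loop, the whole-page append could then be replaced by something like looping over SnippetFinder.ExtractSnippets(currentPageText, searchText, 50) and appending each snippet to textBox1. The window size of 50 characters is an arbitrary choice. Because the extraction strategy inserts line breaks where it detects new lines in the PDF, a search phrase that wraps across two lines may not be found this way.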
